Re: [GIT] Networking
On Sun, Oct 28, 2018 at 7:46 PM David Miller wrote: > > Please pull, thanks a lot! Pulled, Linus
Re: [GIT] Networking
On Fri, May 11, 2018 at 5:10 PM David Millerwrote: > I guess this is my reward for trying to break the monotony of > pull requests :-) I actually went back and checked a few older pull requests to see if this had been going on for a while and I just hadn't noticed. It just took me by surprise :^p Linus
Re: [GIT] Networking
From: Linus TorvaldsDate: Fri, 11 May 2018 14:25:59 -0700 > David, is there something you want to tell us? > > Drugs are bad, m'kay.. I guess this is my reward for trying to break the monotony of pull requests :-)
Re: [GIT] Networking
David, is there something you want to tell us? Drugs are bad, m'kay.. Linus On Fri, May 11, 2018 at 2:00 PM David Millerwrote: > "from Kevin Easton", "Thanks to Bhadram Varka", "courtesy of Gustavo A. > R. Silva", "To Eric Dumazet we are most grateful for this fix", "This > fix from YU Bo, we do appreciate", "Once again we are blessed by the > honorable Eric Dumazet with this fix", "This fix is bestowed upon us by > Andrew Tomt", "another great gift from Eric Dumazet", "to Hangbin Liu we > give thanks for this", "Paolo Abeni, he gave us this", "thank you Moshe > Shemesh", "from our good brother David Howells", "Daniel Juergens, > you're the best!", "Debabrata Benerjee saved us!", "The ship is now > water tight, thanks to Andrey Ignatov", "from Colin Ian King, man we've > got holes everywhere!", "Jiri Pirko what would we do without you!
Re: [GIT] Networking
Sorry, this is a dup of the bug fix pull request from last week. I'll send you the right one.
Re: [GIT] Networking
From: Daniel BorkmannDate: Wed, 15 Nov 2017 23:15:20 +0100 > On 11/15/2017 09:19 PM, Linus Torvalds wrote: >> On Wed, Nov 15, 2017 at 3:33 AM, David Miller wrote: >>> >>> Highlights: >> >> Lowlights: >> >> 1) it duplicated a commit from the hrtimer tree, which had been >> cleaned up and rewritten, but then merging the second copy of the >> commit re-introduced the bad code that had been cleaned up. >> >> I'm talking about commits >> >> - 7d9285e82db5: >> perf/bpf: Extend the perf_event_read_local() interface, a.k.a. >> "bpf: perf event change needed for subsequent bpf helpers" >> - 97562633bcba >> bpf: perf event change needed for subsequent bpf helpers >> >> where apparently there was no discussion between the groups about the >> subsequent changes. >> >> And this must have shown up in linux-next as a conflict, but no >> mention of it from either the perf event tree or the networking tree >> merge. >> >> Although it is of course possible that depending on merge order, the >> problem never showed up in next. > > Sorry about that, it was discussed that the patch in [1] would get > routed through net-next and again cherry-picked from tracing folks > due to conflicting changes in perf event tree that were being worked > on to avoid later merge conflicts - clearly that didn't give the > desired result. > > There was a subsequent discussion in [2] but not sure if cherry-picking > 0d3d73aac2ff ("perf/core: Rewrite event timekeeping") into net-next > would have made it better or worse. We'll have a bpf sub-tree up and > running soon for the next development cycle that can be pulled from > by different parties when needed; potentially this could reduce such > conflicts between trees in future. Sorry for the trouble. Yeah, sorry about all of this. I had hoped that since the patch was being appied to both trees in order to avoid merge problems, no modifications would have been made to the change at either end. This obviously didn't happen. I also didn't communicate the issue to you clearly in the pull request, and for this I apologize. As Daniel says, we realize that bpf is breaching multiple subsystems more and more so over time, and we hope a bpf GIT tree will help alleviate this moving forward. Thanks!
Re: [GIT] Networking
On 11/15/2017 09:19 PM, Linus Torvalds wrote: > On Wed, Nov 15, 2017 at 3:33 AM, David Millerwrote: >> >> Highlights: > > Lowlights: > > 1) it duplicated a commit from the hrtimer tree, which had been > cleaned up and rewritten, but then merging the second copy of the > commit re-introduced the bad code that had been cleaned up. > > I'm talking about commits > > - 7d9285e82db5: > perf/bpf: Extend the perf_event_read_local() interface, a.k.a. > "bpf: perf event change needed for subsequent bpf helpers" > - 97562633bcba > bpf: perf event change needed for subsequent bpf helpers > > where apparently there was no discussion between the groups about the > subsequent changes. > > And this must have shown up in linux-next as a conflict, but no > mention of it from either the perf event tree or the networking tree > merge. > > Although it is of course possible that depending on merge order, the > problem never showed up in next. Sorry about that, it was discussed that the patch in [1] would get routed through net-next and again cherry-picked from tracing folks due to conflicting changes in perf event tree that were being worked on to avoid later merge conflicts - clearly that didn't give the desired result. There was a subsequent discussion in [2] but not sure if cherry-picking 0d3d73aac2ff ("perf/core: Rewrite event timekeeping") into net-next would have made it better or worse. We'll have a bpf sub-tree up and running soon for the next development cycle that can be pulled from by different parties when needed; potentially this could reduce such conflicts between trees in future. Sorry for the trouble. Thanks, Daniel [1] https://patchwork.ozlabs.org/patch/821919/ [2] https://lkml.org/lkml/2017/11/1/53
Re: [GIT] Networking
On Wed, Nov 15, 2017 at 3:33 AM, David Millerwrote: > > Highlights: Lowlights: 1) it duplicated a commit from the hrtimer tree, which had been cleaned up and rewritten, but then merging the second copy of the commit re-introduced the bad code that had been cleaned up. I'm talking about commits - 7d9285e82db5: perf/bpf: Extend the perf_event_read_local() interface, a.k.a. "bpf: perf event change needed for subsequent bpf helpers" - 97562633bcba bpf: perf event change needed for subsequent bpf helpers where apparently there was no discussion between the groups about the subsequent changes. And this must have shown up in linux-next as a conflict, but no mention of it from either the perf event tree or the networking tree merge. Although it is of course possible that depending on merge order, the problem never showed up in next. Linus
Re: [GIT] Networking
On Wed, Sep 6, 2017 at 10:40 PM, Luca Coelhowrote: > > This patch is not very important (unless you really like blinking lights > -- maybe I'll change my mind when the holidays approach :P). so it is > fine if you just want to revert it for now. > > In any case, I'll send a patch fixing this problem soon. No need to revert if we can get this fixed quickly enough. I'll leave the fw-31 on my laptop, so that I can continue to use it for now. Thanks, Linus
Re: [GIT] Networking
On Thu, 2017-09-07 at 05:04 +, Coelho, Luciano wrote: > On Wed, 2017-09-06 at 21:57 -0700, Linus Torvalds wrote: > > On Wed, Sep 6, 2017 at 9:11 PM, Coelho, Luciano > >wrote: > > > > > > This seems to be a problem with backwards-compatibility with FW version > > > 27. We are now in version 31[1] and upgrading will probably fix that. > > > > I can confirm that fw version 31 works. > > Great, so I know for sure that this is a backwards-compatibility issue > with the FW API. > > > > > But obviously the driver should not fail miserably like this with > > > version 27, because it claims to support it still. > > > > Not just "claims to support it", but if it's what is shipped with a > > fairly recent distro like an up-to-date version of F26, I would really > > hope that the driver can still work with it. > > I totally agree, we support a bunch of older versions for that exact > reason. We just don't really test all the supported versions very > often. We should probably change that. > > I'll make sure it still works with version 27. Okay, I found the offending patch: commit 7089ae634c50544b29b31faf1a751e8765c8de3b Author: Johannes Berg AuthorDate: Wed Jun 28 16:19:49 2017 +0200 Commit: Luca Coelho CommitDate: Wed Aug 9 09:15:32 2017 +0300 iwlwifi: mvm: use firmware LED command where applicable On devices starting from 8000 series, the host can no longer toggle the LED through the CSR_LED_REG register, but must do it via the firmware instead. Add support for this. Note that this means that the LED cannot be turned on while the firmware is off, so using an arbitrary LED trigger may not work as expected. Fixes: 503ab8c56ca0 ("iwlwifi: Add 8000 HW family support") Signed-off-by: Johannes Berg Signed-off-by: Luca Coelho Reverting it solves the problem. We introduced a new command to control the LED lights and assumed it was available in older FW versions as well, which turned out not to be the case. This patch is not very important (unless you really like blinking lights -- maybe I'll change my mind when the holidays approach :P). so it is fine if you just want to revert it for now. In any case, I'll send a patch fixing this problem soon. -- Cheers, Luca.
Re: [GIT] Networking
On Wed, 2017-09-06 at 21:57 -0700, Linus Torvalds wrote: > On Wed, Sep 6, 2017 at 9:11 PM, Coelho, Luciano >wrote: > > > > This seems to be a problem with backwards-compatibility with FW version > > 27. We are now in version 31[1] and upgrading will probably fix that. > > I can confirm that fw version 31 works. Great, so I know for sure that this is a backwards-compatibility issue with the FW API. > > But obviously the driver should not fail miserably like this with > > version 27, because it claims to support it still. > > Not just "claims to support it", but if it's what is shipped with a > fairly recent distro like an up-to-date version of F26, I would really > hope that the driver can still work with it. I totally agree, we support a bunch of older versions for that exact reason. We just don't really test all the supported versions very often. We should probably change that. I'll make sure it still works with version 27. -- Cheers, Luca.
Re: [GIT] Networking
On Wed, Sep 6, 2017 at 9:11 PM, Coelho, Lucianowrote: > > This seems to be a problem with backwards-compatibility with FW version > 27. We are now in version 31[1] and upgrading will probably fix that. I can confirm that fw version 31 works. > But obviously the driver should not fail miserably like this with > version 27, because it claims to support it still. Not just "claims to support it", but if it's what is shipped with a fairly recent distro like an up-to-date version of F26, I would really hope that the driver can still work with it. > I'm looking into this now and will provide a fix asap. Thanks, Linus
Re: [GIT] Networking
On Wed, 2017-09-06 at 16:27 -0700, Linus Torvalds wrote: > This pull request completely breaks Intel wireless for me. > > This is my trusty old XPS 13 (9350), using Intel Wireless 8260 (rev 3a). > > That remains a very standard Intel machine with absolutely zero odd > things going on. > > The firmware is iwlwifi-8000C-28.ucode from > iwl7260-firmware-25.30.13.0-75.fc26.noarch, and the kernel reports > > iwlwifi :3a:00.0: loaded firmware version 27.455470.0 op_mode iwlmvm > > the thing starts acting badly with this: > > iwlwifi :3a:00.0: FW Error notification: type 0x cmd_id 0x04 > iwlwifi :3a:00.0: FW Error notification: seq 0x service 0x0004 > iwlwifi :3a:00.0: FW Error notification: timestamp 0x5D84 > iwlwifi :3a:00.0: Microcode SW error detected. Restarting 0x200. > iwlwifi :3a:00.0: Start IWL Error Log Dump: > iwlwifi :3a:00.0: Status: 0x0100, count: 6 > iwlwifi :3a:00.0: Loaded firmware version: 27.455470.0 > iwlwifi :3a:00.0: 0x0038 | BAD_COMMAND > iwlwifi :3a:00.0: 0x00A002F0 | trm_hw_status0 > ... > iwlwifi :3a:00.0: 0x | isr status reg > ieee80211 phy0: Hardware restart was requested > iwlwifi :3a:00.0: FW error in SYNC CMD MAC_CONTEXT_CMD This seems to be a problem with backwards-compatibility with FW version 27. We are now in version 31[1] and upgrading will probably fix that. But obviously the driver should not fail miserably like this with version 27, because it claims to support it still. I'm looking into this now and will provide a fix asap. [1] https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/iwlwifi-8000C-31.ucode -- Cheers, Luca.
Re: [GIT] Networking
On Wed, Sep 6, 2017 at 4:27 PM, Linus Torvaldswrote: > > The firmware is iwlwifi-8000C-28.ucode from > iwl7260-firmware-25.30.13.0-75.fc26.noarch, and the kernel reports > > iwlwifi :3a:00.0: loaded firmware version 27.455470.0 op_mode iwlmvm And when I said "iwlwifi-8000C-28.ucode" I obviously meant "iwlwifi-8000C-27.ucode". At least it was _hopefully_ obvious from that "27" in the actual version number it reports. Linus
Re: [GIT] Networking
From: Linus TorvaldsDate: Wed, 6 Sep 2017 16:27:15 -0700 > This pull request completely breaks Intel wireless for me. > > This is my trusty old XPS 13 (9350), using Intel Wireless 8260 (rev 3a). > > That remains a very standard Intel machine with absolutely zero odd > things going on. > > The firmware is iwlwifi-8000C-28.ucode from > iwl7260-firmware-25.30.13.0-75.fc26.noarch, and the kernel reports ... Johannes and other Intel folks please look into this.
Re: [GIT] Networking
This pull request completely breaks Intel wireless for me. This is my trusty old XPS 13 (9350), using Intel Wireless 8260 (rev 3a). That remains a very standard Intel machine with absolutely zero odd things going on. The firmware is iwlwifi-8000C-28.ucode from iwl7260-firmware-25.30.13.0-75.fc26.noarch, and the kernel reports iwlwifi :3a:00.0: loaded firmware version 27.455470.0 op_mode iwlmvm the thing starts acting badly with this: iwlwifi :3a:00.0: FW Error notification: type 0x cmd_id 0x04 iwlwifi :3a:00.0: FW Error notification: seq 0x service 0x0004 iwlwifi :3a:00.0: FW Error notification: timestamp 0x5D84 iwlwifi :3a:00.0: Microcode SW error detected. Restarting 0x200. iwlwifi :3a:00.0: Start IWL Error Log Dump: iwlwifi :3a:00.0: Status: 0x0100, count: 6 iwlwifi :3a:00.0: Loaded firmware version: 27.455470.0 iwlwifi :3a:00.0: 0x0038 | BAD_COMMAND iwlwifi :3a:00.0: 0x00A002F0 | trm_hw_status0 ... iwlwifi :3a:00.0: 0x | isr status reg ieee80211 phy0: Hardware restart was requested iwlwifi :3a:00.0: FW error in SYNC CMD MAC_CONTEXT_CMD CPU: 2 PID: 993 Comm: NetworkManager Not tainted 4.13.0-06466-g80cee03bf1d6 #4 Hardware name: Dell Inc. XPS 13 9350/09JHRY, BIOS 1.4.17 05/10/2017 Call Trace: dump_stack+0x4d/0x70 iwl_trans_pcie_send_hcmd+0x4e7/0x530 [iwlwifi] ? wait_woken+0x80/0x80 iwl_trans_send_cmd+0x5c/0xc0 [iwlwifi] iwl_mvm_send_cmd+0x32/0x90 [iwlmvm] iwl_mvm_send_cmd_pdu+0x58/0x80 [iwlmvm] iwl_mvm_mac_ctxt_send_cmd+0x2a/0x60 [iwlmvm] ? iwl_mvm_mac_ctxt_send_cmd+0x2a/0x60 [iwlmvm] iwl_mvm_mac_ctxt_cmd_sta+0x140/0x1e0 [iwlmvm] iwl_mvm_mac_ctx_send+0x2d/0x60 [iwlmvm] iwl_mvm_mac_ctxt_add+0x43/0xc0 [iwlmvm] iwl_mvm_mac_add_interface+0x139/0x2b0 [iwlmvm] ? iwl_led_brightness_set+0x1f/0x30 [iwlmvm] drv_add_interface+0x4a/0x120 [mac80211] ieee80211_do_open+0x33d/0x820 [mac80211] ieee80211_open+0x52/0x60 [mac80211] __dev_open+0xae/0x120 __dev_change_flags+0x17b/0x1c0 dev_change_flags+0x29/0x60 do_setlink+0x2f7/0xe60 ? __nla_put+0x20/0x30 ? _raw_read_unlock_bh+0x20/0x30 ? inet6_fill_ifla6_attrs+0x4be/0x4e0 ? __kmalloc_node_track_caller+0x35/0x2b0 ? nla_parse+0x35/0x100 rtnl_newlink+0x5d2/0x8f0 ? __netlink_sendskb+0x3b/0x60 ? security_capset+0x40/0x80 ? ns_capable_common+0x68/0x80 ? ns_capable+0x13/0x20 rtnetlink_rcv_msg+0x1f9/0x280 ? rtnl_calcit.isra.26+0x110/0x110 netlink_rcv_skb+0x8e/0x130 rtnetlink_rcv+0x15/0x20 netlink_unicast+0x18b/0x220 netlink_sendmsg+0x2ad/0x3a0 sock_sendmsg+0x38/0x50 ___sys_sendmsg+0x269/0x2c0 ? addrconf_sysctl_forward+0x114/0x280 ? dev_forward_change+0x140/0x140 ? sysctl_head_finish.part.22+0x32/0x40 ? lockref_put_or_lock+0x5e/0x80 ? dput.part.22+0x13e/0x1c0 ? mntput+0x24/0x40 __sys_sendmsg+0x54/0x90 ? __sys_sendmsg+0x54/0x90 SyS_sendmsg+0x12/0x20 entry_SYSCALL_64_fastpath+0x13/0x94 RIP: 0033:0x7ff1f9933134 RSP: 002b:7ffe7419b460 EFLAGS: 0293 ORIG_RAX: 002e RAX: ffda RBX: 55604b6d82b9 RCX: 7ff1f9933134 RDX: RSI: 7ffe7419b4b0 RDI: 0007 RBP: 7ffe7419b940 R08: R09: 55604d16b400 R10: 7ff1f7cf8b38 R11: 0293 R12: 0001 R13: 0001 R14: 7ffe7419b670 R15: 55604b9515a0 iwlwifi :3a:00.0: Failed to send MAC context (action:1): -5 and it doesn't get any better from there. The next error seems to be Timeout waiting for hardware access (CSR_GP_CNTRL 0x0808) [ cut here ] WARNING: CPU: 3 PID: 1075 at drivers/net/wireless/intel/iwlwifi/pcie/trans.c:1874 iwl_trans_pcie_grab_nic_access+0xdf/0xf0 [iwlwifi] and it will continue with those microcode failure errors and various other warnigns about how nothing is working. And no, nothing works. A lot of log output, no actual network access.. Linus
Re: [GIT] Networking
(Adding linux-wireless) Pavel Machekwrites: > On Thu 2017-08-31 07:44:58, Kalle Valo wrote: >> David Miller writes: >> >> > From: Kalle Valo >> > Date: Wed, 30 Aug 2017 20:31:31 +0300 >> > >> >> AFAICS the bug was introduced by 9df86e2e702c6 back in 2010. If the bug >> >> has been there for 7 years so waiting for a few more weeks should not >> >> hurt. >> > >> > As a maintainer you have a right to handle bug fixing in that way, but >> > certainly that is not how I would handle this. >> > >> > It's easy to validate this fix, it's extremely unlikely to cause >> > a regression, and fixes a problem someone actually was able to >> > trigger. >> > >> > Deferring to -next only has the side effect of making people wait >> > longer for the fix. >> >> Yeah, you are right there. I did actually ponder which I tree should >> commit it back in July but due to various reasons decided differently. > > Can we still get the fix to v4.13-final? :-). I'm not planning to submit pull requests to 4.13 anymore. If you think this is so important that it needs to be applied in the last minute (I don't) you could always try to convince Dave to take it directly. Or better yet, push it to the stable tree. If the merge window opens on Sunday I suspect that the commit will be in Linus' tree sometime next week. Then you can submit the request to the stable team to take it. -- Kalle Valo
Re: [GIT] Networking
On Thu 2017-08-31 07:44:58, Kalle Valo wrote: > David Millerwrites: > > > From: Kalle Valo > > Date: Wed, 30 Aug 2017 20:31:31 +0300 > > > >> AFAICS the bug was introduced by 9df86e2e702c6 back in 2010. If the bug > >> has been there for 7 years so waiting for a few more weeks should not > >> hurt. > > > > As a maintainer you have a right to handle bug fixing in that way, but > > certainly that is not how I would handle this. > > > > It's easy to validate this fix, it's extremely unlikely to cause > > a regression, and fixes a problem someone actually was able to > > trigger. > > > > Deferring to -next only has the side effect of making people wait > > longer for the fix. > > Yeah, you are right there. I did actually ponder which I tree should > commit it back in July but due to various reasons decided differently. Can we still get the fix to v4.13-final? :-). Thanks, Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html signature.asc Description: Digital signature
Re: [GIT] Networking
David Millerwrites: > From: Kalle Valo > Date: Wed, 30 Aug 2017 20:31:31 +0300 > >> AFAICS the bug was introduced by 9df86e2e702c6 back in 2010. If the bug >> has been there for 7 years so waiting for a few more weeks should not >> hurt. > > As a maintainer you have a right to handle bug fixing in that way, but > certainly that is not how I would handle this. > > It's easy to validate this fix, it's extremely unlikely to cause > a regression, and fixes a problem someone actually was able to > trigger. > > Deferring to -next only has the side effect of making people wait > longer for the fix. Yeah, you are right there. I did actually ponder which I tree should commit it back in July but due to various reasons decided differently. -- Kalle Valo
Re: [GIT] Networking
From: Kalle ValoDate: Wed, 30 Aug 2017 20:31:31 +0300 > AFAICS the bug was introduced by 9df86e2e702c6 back in 2010. If the bug > has been there for 7 years so waiting for a few more weeks should not > hurt. As a maintainer you have a right to handle bug fixing in that way, but certainly that is not how I would handle this. It's easy to validate this fix, it's extremely unlikely to cause a regression, and fixes a problem someone actually was able to trigger. Deferring to -next only has the side effect of making people wait longer for the fix.
Re: [GIT] Networking
David Millerwrites: > From: Kalle Valo > Date: Wed, 30 Aug 2017 17:45:31 +0300 > >> Pavel Machek writes: >> >>> Could we get this one in? >>> >>> wl1251 misses a spin_lock_init(). >>> >>> https://www.mail-archive.com/netdev@vger.kernel.org/msg177031.html >>> >>> It seems pretty trivial, yet getting the backtraces is not nice. >> >> It's in wireless-drivers-next and will be in 4.14-rc1: >> >> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next.git/commit/?id=6e9aae179f290f1a44fce7ef8e9a8e2dd68ed1e4 > > Is the bug only present in net-next? AFAICS the bug was introduced by 9df86e2e702c6 back in 2010. If the bug has been there for 7 years so waiting for a few more weeks should not hurt. And Pavel can also submit it to the stable release, it should apply without problems as wl1251 doesn't have had that many patches during the last few years (if ever). -- Kalle Valo
Re: [GIT] Networking
From: Kalle ValoDate: Wed, 30 Aug 2017 17:45:31 +0300 > Pavel Machek writes: > >> Could we get this one in? >> >> wl1251 misses a spin_lock_init(). >> >> https://www.mail-archive.com/netdev@vger.kernel.org/msg177031.html >> >> It seems pretty trivial, yet getting the backtraces is not nice. > > It's in wireless-drivers-next and will be in 4.14-rc1: > > https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next.git/commit/?id=6e9aae179f290f1a44fce7ef8e9a8e2dd68ed1e4 Is the bug only present in net-next?
Re: [GIT] Networking
Pavel Machekwrites: > Could we get this one in? > > wl1251 misses a spin_lock_init(). > > https://www.mail-archive.com/netdev@vger.kernel.org/msg177031.html > > It seems pretty trivial, yet getting the backtraces is not nice. It's in wireless-drivers-next and will be in 4.14-rc1: https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next.git/commit/?id=6e9aae179f290f1a44fce7ef8e9a8e2dd68ed1e4 -- Kalle Valo
Re: [GIT] Networking
Hi! Could we get this one in? wl1251 misses a spin_lock_init(). https://www.mail-archive.com/netdev@vger.kernel.org/msg177031.html It seems pretty trivial, yet getting the backtraces is not nice. Thanks, Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html signature.asc Description: Digital signature
Re: [GIT] Networking
From: Linus TorvaldsDate: Tue, 15 Aug 2017 19:21:16 -0700 > On Tue, Aug 15, 2017 at 5:52 PM, David Miller wrote: >> >> dingtianhong (4): >> PCI: Disable PCIe Relaxed Ordering if unsupported >> PCI: Disable Relaxed Ordering for some Intel processors >> PCI: Disable Relaxed Ordering Attributes for AMD A1100 >> PCI: fix oops when try to find Root Port for a PCI device > > I would *really* have liked to see an ack on these from Bjorn Helgaas. > Was he even cc'd? > > And while singling those commits out, I would also really have liked > to see an actual name there. > > The name exists in the sign-off chain: > > Signed-off-by: Ding Tianhong > > but for some reason not in the actual commit author data, where it's > just "dingtianhong". > > Pulled, but slightly unhappy. Bjorn did review these changes, and certainly shaped the final result, but indeed I should have gotten an explicit ACK from him. I'll make sure I do so next time.
Re: [GIT] Networking
On Tue, Aug 15, 2017 at 5:52 PM, David Millerwrote: > > dingtianhong (4): > PCI: Disable PCIe Relaxed Ordering if unsupported > PCI: Disable Relaxed Ordering for some Intel processors > PCI: Disable Relaxed Ordering Attributes for AMD A1100 > PCI: fix oops when try to find Root Port for a PCI device I would *really* have liked to see an ack on these from Bjorn Helgaas. Was he even cc'd? And while singling those commits out, I would also really have liked to see an actual name there. The name exists in the sign-off chain: Signed-off-by: Ding Tianhong but for some reason not in the actual commit author data, where it's just "dingtianhong". Pulled, but slightly unhappy. Linus
Re: [GIT] Networking
On Tue, Jul 11, 2017 at 01:31:14PM -0700, David Miller wrote: > > Acked-by: David S. Miller> > Looks good, is this going via my tree or your's? I'll push it along. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [GIT] Networking
From: Herbert XuDate: Mon, 10 Jul 2017 22:00:48 +0800 > crypto: af_alg - Avoid sock_graft call warning > > The newly added sock_graft warning triggers in af_alg_accept. > It's harmless as we're essentially doing sock->sk = sock->sk. > > The sock_graft call is actually redundant because all the work > it does is subsumed by sock_init_data. However, it was added > to placate SELinux as it uses it to initialise its internal state. > > This patch avoisd the warning by making the SELinux call directly. > > Reported-by: Linus Torvalds > Signed-off-by: Herbert Xu Acked-by: David S. Miller Looks good, is this going via my tree or your's?
Re: [GIT] Networking
On Mon, Jul 10, 2017 at 08:10:02AM -0400, Sowmini Varadhan wrote: > > The reason that the WARN_ON is triggered is that af_alg_accept() calls > sock_init_data() which does > >2636 if (sock) { > : >2639 sock->sk= sk; Oh yes. This started out with just sock_init_data which would not have triggered your warning. Then someone came along and added sock_graft which basically duplicates work already done in sock_init_data. In fact the reason they wanted sock_graft was purely to call security_sock_graft to initialise some SELinux state. So we could avoid your warning by calling that directly. ---8<--- crypto: af_alg - Avoid sock_graft call warning The newly added sock_graft warning triggers in af_alg_accept. It's harmless as we're essentially doing sock->sk = sock->sk. The sock_graft call is actually redundant because all the work it does is subsumed by sock_init_data. However, it was added to placate SELinux as it uses it to initialise its internal state. This patch avoisd the warning by making the SELinux call directly. Reported-by: Linus TorvaldsSigned-off-by: Herbert Xu diff --git a/crypto/af_alg.c b/crypto/af_alg.c index 3556d8e..92a3d54 100644 --- a/crypto/af_alg.c +++ b/crypto/af_alg.c @@ -287,7 +287,7 @@ int af_alg_accept(struct sock *sk, struct socket *newsock, bool kern) goto unlock; sock_init_data(newsock, sk2); - sock_graft(sk2, newsock); + security_sock_graft(sk2, newsock); security_sk_clone(sk, sk2); err = type->accept(ask->private, sk2); -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [GIT] Networking
On (07/10/17 18:05), Herbert Xu wrote: > > Hmm, I can't see the problem in af_alg_accept. The struct socket > comes directly from sys_accept() which creates it using sock_alloc. > > So the only thing I can think of is that the memory returned by > sock_alloc is not zeroed and therefore the WARN_ON is just reading > garbage. Then it is odd that this WARN_ON is not triggered for other sockets (e.g., for TCP sockets), though it happens easily with AF_ALG. But it's not sock_alloc() - that function is returning a properly zeroed ->sk. The reason that the WARN_ON is triggered is that af_alg_accept() calls sock_init_data() which does 2636 if (sock) { : 2639 sock->sk= sk; So we can do one of the following: 1. drop the WARN_ON(), which makes true leaks hard to detect 2. change the WARN_ON() to WARN_ON(parent->sk && parent->sk != sk) #2 assumes that all the refcount book-keeping is being done correctly (there is the danger that we end up taking 2 refs on the sk) --Sowmini
Re: [GIT] Networking
On Sun, Jul 09, 2017 at 09:40:43PM +0100, David Miller wrote: > > > It look like PF_ALG sets up a ->sk in alg_create() (but this > > would get over-written in alg_accept()?) No it does not. The struct socket comes from sys_accept() and AFAICS it doesn't carry a struck sock with it. > > Cc'ing Herbert to see if this is expected behavior (and PF_ALG > > somehow does the right thing with the refcount for the ->sk > > set up in alg_create) in which case I suppose we should drop the > > WARN_ON. > > I think we've found yet another socket leak, this time in > AF_ALG. Hmm, I can't see the problem in af_alg_accept. The struct socket comes directly from sys_accept() which creates it using sock_alloc. So the only thing I can think of is that the memory returned by sock_alloc is not zeroed and therefore the WARN_ON is just reading garbage. Cheers, -- Email: Herbert XuHome Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [GIT] Networking
From: Sowmini VaradhanDate: Sun, 9 Jul 2017 15:11:31 -0400 > On (07/09/17 11:49), Linus Torvalds wrote: >> >> On Sat, Jul 8, 2017 at 3:36 AM, David Miller wrote: >> > >> > 8) Fix socket leak on accept() in RDS, from Sowmini Varadhan. Also >> >add a WARN_ON() to sock_graft() so other protocol stacks don't trip >> >over this as well. >> >> Hmm. This one triggers for me on both my desktop and laptop at bootup. >> Bog-standard machines, running F25 and F24 respectively. >> >> The warning doesn't seem particularly useful, although maybe that >> "alg_accept()" gives people who know this code enough of a clue. > > My initial question was whether sock_graft() should do a sock_put() > before cutting loose the existing parent->sk and assigning a new > parent->sk (https://www.spinics.net/lists/netdev/msg442191.html) > > It look like PF_ALG sets up a ->sk in alg_create() (but this > would get over-written in alg_accept()?) > > Cc'ing Herbert to see if this is expected behavior (and PF_ALG > somehow does the right thing with the refcount for the ->sk > set up in alg_create) in which case I suppose we should drop the > WARN_ON. I think we've found yet another socket leak, this time in AF_ALG.
Re: [GIT] Networking
On (07/09/17 11:49), Linus Torvalds wrote: > > On Sat, Jul 8, 2017 at 3:36 AM, David Millerwrote: > > > > 8) Fix socket leak on accept() in RDS, from Sowmini Varadhan. Also > >add a WARN_ON() to sock_graft() so other protocol stacks don't trip > >over this as well. > > Hmm. This one triggers for me on both my desktop and laptop at bootup. > Bog-standard machines, running F25 and F24 respectively. > > The warning doesn't seem particularly useful, although maybe that > "alg_accept()" gives people who know this code enough of a clue. My initial question was whether sock_graft() should do a sock_put() before cutting loose the existing parent->sk and assigning a new parent->sk (https://www.spinics.net/lists/netdev/msg442191.html) It look like PF_ALG sets up a ->sk in alg_create() (but this would get over-written in alg_accept()?) Cc'ing Herbert to see if this is expected behavior (and PF_ALG somehow does the right thing with the refcount for the ->sk set up in alg_create) in which case I suppose we should drop the WARN_ON. --Sowmini > [ cut here ] > WARNING: CPU: 1 PID: 492 at ./include/net/sock.h:1700 > af_alg_accept+0x1bf/0x1f0 > CPU: 1 PID: 492 Comm: systemd-cryptse Not tainted 4.12.0-09010-g2b976203417c > #1 > Hardware name: System manufacturer System Product Name/Z170-K, BIOS > 1803 05/06/2016 > RIP: 0010:af_alg_accept+0x1bf/0x1f0 > Call Trace: > alg_accept+0x15/0x20 > SYSC_accept4+0x105/0x210 > ? getnstimeofday64+0xe/0x20 > ? __audit_syscall_entry+0xb1/0xf0 > ? syscall_trace_enter+0x1bd/0x2d0 > ? __audit_syscall_exit+0x1a5/0x2a0 > SyS_accept+0x10/0x20 > do_syscall_64+0x61/0x140 > entry_SYSCALL64_slow_path+0x25/0x25 > ---[ end trace a35e5baea85df269 ]---
Re: [GIT] Networking
On Sat, Jul 8, 2017 at 3:36 AM, David Millerwrote: > > 8) Fix socket leak on accept() in RDS, from Sowmini Varadhan. Also >add a WARN_ON() to sock_graft() so other protocol stacks don't trip >over this as well. Hmm. This one triggers for me on both my desktop and laptop at bootup. Bog-standard machines, running F25 and F24 respectively. The warning doesn't seem particularly useful, although maybe that "alg_accept()" gives people who know this code enough of a clue. Linus [ cut here ] WARNING: CPU: 1 PID: 492 at ./include/net/sock.h:1700 af_alg_accept+0x1bf/0x1f0 CPU: 1 PID: 492 Comm: systemd-cryptse Not tainted 4.12.0-09010-g2b976203417c #1 Hardware name: System manufacturer System Product Name/Z170-K, BIOS 1803 05/06/2016 RIP: 0010:af_alg_accept+0x1bf/0x1f0 Call Trace: alg_accept+0x15/0x20 SYSC_accept4+0x105/0x210 ? getnstimeofday64+0xe/0x20 ? __audit_syscall_entry+0xb1/0xf0 ? syscall_trace_enter+0x1bd/0x2d0 ? __audit_syscall_exit+0x1a5/0x2a0 SyS_accept+0x10/0x20 do_syscall_64+0x61/0x140 entry_SYSCALL64_slow_path+0x25/0x25 ---[ end trace a35e5baea85df269 ]---
SIPHASH (was: Re: [GIT] Networking)
Hi Jason, On Wed, Feb 22, 2017 at 5:31 AM, David Millerwrote: > 3) Introduce SIPHASH and it's usage for secure sequence numbers and >syncookies. From Jason A. Donenfeld. > Jason A. Donenfeld (4): > siphash: add cryptographically secure PRF > siphash: implement HalfSipHash1-3 for hash tables > secure_seq: use SipHash in place of MD5 > syncookies: use SipHash in place of SHA1 bloat-o-meter v4.10.. says: add/remove: 338/127 grow/shrink: 604/310 up/down: 86156/-24117 (62039) ... siphash_4u64 -5006 +5006 siphash_3u64 -4298 +4298 siphash_2u64 -3582 +3582 __siphash_unaligned-3052 +3052 __siphash_aligned -3052 +3052 siphash_3u32 -2976 +2976 siphash_1u64 -2870 +2870 siphash_1u32 -2172 +2172 ... Do we need all of this builtin, unconditionally? Thanks for your answer! Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [GIT] Networking
From: David MillerDate: Wed, 15 Feb 2017 22:26:56 -0500 (EST) > From: Linus Torvalds > Date: Wed, 15 Feb 2017 18:01:24 -0800 > >> So David, you really need to convince me that this pull is truly >> required. > > I'll revert those changes, give me a second. Ok, pull from my tree again, there will one new commit that reverts the nested rhashtable changes. git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git Thanks!
Re: [GIT] Networking
From: Linus TorvaldsDate: Wed, 15 Feb 2017 18:01:24 -0800 > So David, you really need to convince me that this pull is truly > required. I'll revert those changes, give me a second.
Re: [GIT] Networking
On Wed, Feb 15, 2017 at 5:31 PM, David Millerwrote: > > 3) More gracefully handle rhashtable insertion errors when vmalloc is >not possible, from Herbert Xu. Ugh. So I pulled this, but when I look at his code, I'm really not sure that I should have, and I haven't pushed the result out. I'm pretty sure I will unpull. This code looks like it could *easily* have subtle issues, and looks like the kind of thing that should go in a merge window. The merge commit has this re-assuring sentence in it from Herbert: "I've tested this with test_rhashtable and it appears to work." and this is during the very last week of an rc release. None of the commits even talk about why this was needed in the first place, so there's no indication that this new scary code was really so high-priority that it absolutely *has* to go into a late rc release. None of it is marked for stable either, since networking never does that. So David, you really need to convince me that this pull is truly required. Linus
Re: [GIT] Networking
Thanks. Pulled, going through my usual allmodconfig test-build before being pushed out, Linus On Wed, Jan 11, 2017 at 7:22 AM, David Millerwrote: > > git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git
Re: [GIT] Networking
David Millerwrites: > From: Linus Torvalds > Date: Mon, 9 Jan 2017 12:08:02 -0800 > >> On Sun, Jan 8, 2017 at 7:38 PM, David Miller wrote: >>> >>> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net >> >> Hmm. This still doesn't contain the rtlwifi oops fix that was posted >> back before christmas. >> >> Kalle said it was applied to the wireless-drivers tree as commit >> 60f59ce02785 in his tree, but it's never gotten to me. >> >> What's up? It's three weeks later, and people are hitting the bug. > > I haven't received a bug fix pull from Kalle since Dec 22nd: > > http://patchwork.ozlabs.org/patch/708247/ > > That one has: > > rtlwifi: Fix kernel oops introduced with commit e49656147359 > > Is that what you're talking about? > > In the thread I said I pulled it but indeed I don't have that commit > in my tree either, weird. > > I just tried to pull that git URL right now and I get nothing: > > > git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers.git > tags/wireless-drivers-for-davem-2016-12-22 > > It looks like the rtlwifi fix is in his "pending" branch. Be careful, there are two recent rtlwifi fixes in wireless-drivers: 60f59ce02785 rtlwifi: rtl_usb: Fix missing entry in USB driver's private data 22b68b93ae25 rtlwifi: Fix kernel oops introduced with commit e49656147359 You should have 22b68b93ae25 in your tree, but not 60f59ce02785. The reason why you didn't get 60f59ce02785 in my pull request was that I committed it late: commit 60f59ce0278557f7896d5158ae6d12a4855a72cc Author: Larry Finger AuthorDate: Wed Dec 21 11:18:55 2016 -0600 Commit: Kalle Valo CommitDate: Fri Dec 30 15:38:13 2016 +0200 I'll send you a new pull request tomorrow (my time). -- Kalle Valo
Re: [GIT] Networking
Linus Torvaldswrites: > On Sun, Jan 8, 2017 at 7:38 PM, David Miller wrote: >> >> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net > > Hmm. This still doesn't contain the rtlwifi oops fix that was posted > back before christmas. > > Kalle said it was applied to the wireless-drivers tree as commit > 60f59ce02785 in his tree, but it's never gotten to me. > > What's up? It's three weeks later, and people are hitting the bug. You mean this one: 60f59ce02785 rtlwifi: rtl_usb: Fix missing entry in USB driver's private data I accidentally missed the patch during my holiday pull request, committed the patch late, and I haven't sent a new pull request since. Also the commit log didn't mention anything about an oops, only about "missing an entry", so I didn't realise it was a critical fix. Sorry about that, I'll send a new pull request to Dave first thing tomorrow. It's too late now for me. -- Kalle Valo
Re: [GIT] Networking
From: Linus TorvaldsDate: Mon, 9 Jan 2017 12:08:02 -0800 > On Sun, Jan 8, 2017 at 7:38 PM, David Miller wrote: >> >> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net > > Hmm. This still doesn't contain the rtlwifi oops fix that was posted > back before christmas. > > Kalle said it was applied to the wireless-drivers tree as commit > 60f59ce02785 in his tree, but it's never gotten to me. > > What's up? It's three weeks later, and people are hitting the bug. I haven't received a bug fix pull from Kalle since Dec 22nd: http://patchwork.ozlabs.org/patch/708247/ That one has: rtlwifi: Fix kernel oops introduced with commit e49656147359 Is that what you're talking about? In the thread I said I pulled it but indeed I don't have that commit in my tree either, weird. I just tried to pull that git URL right now and I get nothing: git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers.git tags/wireless-drivers-for-davem-2016-12-22 It looks like the rtlwifi fix is in his "pending" branch.
Re: [GIT] Networking
On Sun, Jan 8, 2017 at 7:38 PM, David Millerwrote: > > git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Hmm. This still doesn't contain the rtlwifi oops fix that was posted back before christmas. Kalle said it was applied to the wireless-drivers tree as commit 60f59ce02785 in his tree, but it's never gotten to me. What's up? It's three weeks later, and people are hitting the bug. Linus
Re: [GIT] Networking
Hi Dave, On Wed, 05 Oct 2016 22:56:12 -0400 (EDT) David Millerwrote: > > Yes, this is where the change got lost. No worries. > I have all of the fixups queued up in my net tree and will send in a pull > request later. Thanks. -- Cheers, Stephen Rothwell
Re: [GIT] Networking
From: Stephen RothwellDate: Thu, 6 Oct 2016 13:51:52 +1100 > On Wed, 5 Oct 2016 19:14:21 -0700 Linus Torvalds > wrote: >> >> On Wed, Oct 5, 2016 at 5:52 PM, Stephen Rothwell >> wrote: >> > >> > Except that commit effectively moved that function from >> > net/netfilter/nf_tables_netdev.c to >> > include/net/netfilter/nf_tables_ipv4.h while commit c73c24849011 >> > ("netfilter: nf_tables_netdev: remove redundant ip_hdr assignment") >> > removed the assignment in the original file (and has been in your tree >> > since v4.8-rc7) and that is where I originally actually got a conflict. >> >> Oh, interesting. Why didn't I get the conflict there then? >> >> I'm guessing (but too lazy to actually look up the history), that >> David ended up doing that merge and that ends up being why I never saw >> a conflict. > > Yeah, commit b50afd203a5e ("Merge > git://git.kernel.org/pub/scm/linux/kernel/git/davem/net") actually > merges v4.8 into the net-next tree. Yes, this is where the change got lost. I have all of the fixups queued up in my net tree and will send in a pull request later.
Re: [GIT] Networking
Hi Linus, On Wed, 5 Oct 2016 19:14:21 -0700 Linus Torvaldswrote: > > On Wed, Oct 5, 2016 at 5:52 PM, Stephen Rothwell > wrote: > > > > Except that commit effectively moved that function from > > net/netfilter/nf_tables_netdev.c to > > include/net/netfilter/nf_tables_ipv4.h while commit c73c24849011 > > ("netfilter: nf_tables_netdev: remove redundant ip_hdr assignment") > > removed the assignment in the original file (and has been in your tree > > since v4.8-rc7) and that is where I originally actually got a conflict. > > Oh, interesting. Why didn't I get the conflict there then? > > I'm guessing (but too lazy to actually look up the history), that > David ended up doing that merge and that ends up being why I never saw > a conflict. Yeah, commit b50afd203a5e ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net") actually merges v4.8 into the net-next tree. -- Cheers, Stephen Rothwell
Re: [GIT] Networking
On Wed, Oct 5, 2016 at 5:52 PM, Stephen Rothwellwrote: > > Except that commit effectively moved that function from > net/netfilter/nf_tables_netdev.c to > include/net/netfilter/nf_tables_ipv4.h while commit c73c24849011 > ("netfilter: nf_tables_netdev: remove redundant ip_hdr assignment") > removed the assignment in the original file (and has been in your tree > since v4.8-rc7) and that is where I originally actually got a conflict. Oh, interesting. Why didn't I get the conflict there then? I'm guessing (but too lazy to actually look up the history), that David ended up doing that merge and that ends up being why I never saw a conflict. Linus
Re: [GIT] Networking
Hi Linus, On Wed, 5 Oct 2016 15:37:17 -0700 Linus Torvaldswrote: > > On Wed, Oct 5, 2016 at 3:29 PM, Stephen Rothwell > wrote: > > > > I have been carrying the following merge fix patch (for the merge of > > the net-next tree with Linus' tree) for a while now which seems to have > > got missed: > > Ugh. It doesn't seem to be a merge error, because that double iph > assignment came from the original patch that introduced this function: > commit ddc8b6027ad0 ("netfilter: introduce nft_set_pktinfo_{ipv4, > ipv6}_validate()"). Except that commit effectively moved that function from net/netfilter/nf_tables_netdev.c to include/net/netfilter/nf_tables_ipv4.h while commit c73c24849011 ("netfilter: nf_tables_netdev: remove redundant ip_hdr assignment") removed the assignment in the original file (and has been in your tree since v4.8-rc7) and that is where I originally actually got a conflict. > So I wouldn't call it a merge error - it just looks like a bug in the > network layer. So I'm not going to apply your patch even though it > looks plausible to me, simply because it's outside my area of > expertise. no worries. -- Cheers, Stephen Rothwell
Re: [GIT] Networking
From: Pablo Neira AyusoDate: Thu, 6 Oct 2016 02:09:45 +0200 > On Wed, Oct 05, 2016 at 03:37:17PM -0700, Linus Torvalds wrote: >> On Wed, Oct 5, 2016 at 3:29 PM, Stephen Rothwell >> wrote: >> > >> > I have been carrying the following merge fix patch (for the merge of >> > the net-next tree with Linus' tree) for a while now which seems to have >> > got missed: >> >> Ugh. It doesn't seem to be a merge error, because that double iph >> assignment came from the original patch that introduced this function: >> commit ddc8b6027ad0 ("netfilter: introduce nft_set_pktinfo_{ipv4, >> ipv6}_validate()"). >> >> So I wouldn't call it a merge error - it just looks like a bug in the >> network layer. So I'm not going to apply your patch even though it >> looks plausible to me, simply because it's outside my area of >> expertise. >> >> David? Pablo? > > This looks good, please take it so we speed up things. > > Acked-by: Pablo Neira Ayuso Applied.
Re: [GIT] Networking
On Wed, Oct 05, 2016 at 03:37:17PM -0700, Linus Torvalds wrote: > On Wed, Oct 5, 2016 at 3:29 PM, Stephen Rothwell> wrote: > > > > I have been carrying the following merge fix patch (for the merge of > > the net-next tree with Linus' tree) for a while now which seems to have > > got missed: > > Ugh. It doesn't seem to be a merge error, because that double iph > assignment came from the original patch that introduced this function: > commit ddc8b6027ad0 ("netfilter: introduce nft_set_pktinfo_{ipv4, > ipv6}_validate()"). > > So I wouldn't call it a merge error - it just looks like a bug in the > network layer. So I'm not going to apply your patch even though it > looks plausible to me, simply because it's outside my area of > expertise. > > David? Pablo? This looks good, please take it so we speed up things. Acked-by: Pablo Neira Ayuso Thanks! P.S: Sorry for not addressing this any sooner, traveling overhead, conferente and unstable wifi connection has been a problem here.
Re: [GIT] Networking
On Wed, Oct 5, 2016 at 3:29 PM, Stephen Rothwellwrote: > > I have been carrying the following merge fix patch (for the merge of > the net-next tree with Linus' tree) for a while now which seems to have > got missed: Ugh. It doesn't seem to be a merge error, because that double iph assignment came from the original patch that introduced this function: commit ddc8b6027ad0 ("netfilter: introduce nft_set_pktinfo_{ipv4, ipv6}_validate()"). So I wouldn't call it a merge error - it just looks like a bug in the network layer. So I'm not going to apply your patch even though it looks plausible to me, simply because it's outside my area of expertise. David? Pablo? Linus
Re: [GIT] Networking
Hi Linus, Dave, On Wed, 05 Oct 2016 01:44:37 -0400 (EDT) David Millerwrote: > I have been carrying the following merge fix patch (for the merge of the net-next tree with Linus' tree) for a while now which seems to have got missed: From: Stephen Rothwell Date: Tue, 13 Sep 2016 10:08:58 +1000 Subject: [PATCH] netfilter: merge fixup for "nf_tables_netdev: remove redundant ip_hdr assignment" Signed-off-by: Stephen Rothwell --- include/net/netfilter/nf_tables_ipv4.h | 1 - 1 file changed, 1 deletion(-) diff --git a/include/net/netfilter/nf_tables_ipv4.h b/include/net/netfilter/nf_tables_ipv4.h index 968f00b82fb5..25e33aee91e7 100644 --- a/include/net/netfilter/nf_tables_ipv4.h +++ b/include/net/netfilter/nf_tables_ipv4.h @@ -33,7 +33,6 @@ __nft_set_pktinfo_ipv4_validate(struct nft_pktinfo *pkt, if (!iph) return -1; - iph = ip_hdr(skb); if (iph->ihl < 5 || iph->version != 4) return -1; -- 2.8.1 -- Cheers, Stephen Rothwell
Re: [GIT] Networking
On 5/19/16, Reinoud Koornstrawrote: > On Thu, May 19, 2016 at 2:20 AM, Reinoud Koornstra > wrote: >> On Wed, May 18, 2016 at 12:51 PM, Linus Torvalds >> wrote: >>> On Wed, May 18, 2016 at 11:45 AM, Linus Torvalds >>> wrote: From what I can tell, there's a merge bug in commit 909b27f70643, where David seems to have lost some of the changes to iwl_mvm_set_tx_cmd(). I do not know if that's the reason for the problem I see. But I will test. >>> >>> Yes. The attached patch that fixes the incorrect merge seems to fix >>> things for me. >>> >>> That should mean that the assumption that this problem existed in v4.6 >>> too was wrong, because the incorrect merge came in later. I think >>> Luciano mis-understood "v4.6+" to mean plain v4.6. >>> >>> Reinoud Koornstra, does this patch fix things for you too? >> >> Indeed, I meant 4.6+, not 4.6. >> The patch you attached doesn't change existing code for me in 4.6+ as >> these two lines are already in there. >> Thanks, >> >> Reinoud. >> > > In the 4.6+ code from today I reverted commit 5c08b0f5026f. > Now iwlwifi works fine for me again. > So it's as the Intel guys suspected. > I'll attached my revert compared to the current 4.6+ development code. > Thanks, > I am not affected by any of these iwlwifi issues. Can you next time add the commit-subject together with the commit-id, like this... commit 5c08b0f5026fcc13efb947c4d1f2ca3558145f68 "iwlwifi: mvm: don't override the rate with the AMSDU len" ...it's easier for followers of this thread. - Sedat - [1] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=5c08b0f5026fcc13efb947c4d1f2ca3558145f68
Re: [GIT] Networking
On Thu, May 19, 2016 at 2:20 AM, Reinoud Koornstrawrote: > On Wed, May 18, 2016 at 12:51 PM, Linus Torvalds > wrote: >> On Wed, May 18, 2016 at 11:45 AM, Linus Torvalds >> wrote: >>> >>> From what I can tell, there's a merge bug in commit 909b27f70643, >>> where David seems to have lost some of the changes to >>> iwl_mvm_set_tx_cmd(). >>> >>> I do not know if that's the reason for the problem I see. But I will test. >> >> Yes. The attached patch that fixes the incorrect merge seems to fix >> things for me. >> >> That should mean that the assumption that this problem existed in v4.6 >> too was wrong, because the incorrect merge came in later. I think >> Luciano mis-understood "v4.6+" to mean plain v4.6. >> >> Reinoud Koornstra, does this patch fix things for you too? > > Indeed, I meant 4.6+, not 4.6. > The patch you attached doesn't change existing code for me in 4.6+ as > these two lines are already in there. > Thanks, > > Reinoud. > In the 4.6+ code from today I reverted commit 5c08b0f5026f. Now iwlwifi works fine for me again. So it's as the Intel guys suspected. I'll attached my revert compared to the current 4.6+ development code. Thanks, Reinoud. >> >>Linus diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c index c53aa0f..00c9298 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c @@ -211,7 +211,6 @@ void iwl_mvm_set_tx_cmd(struct iwl_mvm *mvm, struct sk_buff *skb, struct iwl_tx_cmd *tx_cmd, struct ieee80211_tx_info *info, u8 sta_id) { - struct ieee80211_tx_info *skb_info = IEEE80211_SKB_CB(skb); struct ieee80211_hdr *hdr = (void *)skb->data; __le16 fc = hdr->frame_control; u32 tx_flags = le32_to_cpu(tx_cmd->tx_flags); @@ -295,7 +294,7 @@ void iwl_mvm_set_tx_cmd(struct iwl_mvm *mvm, struct sk_buff *skb, tx_cmd->tx_flags = cpu_to_le32(tx_flags); /* Total # bytes to be transmitted */ tx_cmd->len = cpu_to_le16((u16)skb->len + - (uintptr_t)skb_info->driver_data[0]); + (uintptr_t)info->driver_data[0]); tx_cmd->life_time = cpu_to_le32(TX_CMD_LIFE_TIME_INFINITE); tx_cmd->sta_id = sta_id; @@ -443,11 +442,10 @@ static void iwl_mvm_set_tx_cmd_crypto(struct iwl_mvm *mvm, */ static struct iwl_device_cmd * iwl_mvm_set_tx_params(struct iwl_mvm *mvm, struct sk_buff *skb, - struct ieee80211_tx_info *info, int hdrlen, - struct ieee80211_sta *sta, u8 sta_id) + int hdrlen, struct ieee80211_sta *sta, u8 sta_id) { struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)skb->data; - struct ieee80211_tx_info *skb_info = IEEE80211_SKB_CB(skb); + struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb); struct iwl_device_cmd *dev_cmd; struct iwl_tx_cmd *tx_cmd; @@ -467,10 +465,10 @@ iwl_mvm_set_tx_params(struct iwl_mvm *mvm, struct sk_buff *skb, iwl_mvm_set_tx_cmd_rate(mvm, tx_cmd, info, sta, hdr->frame_control); - memset(_info->status, 0, sizeof(skb_info->status)); - memset(skb_info->driver_data, 0, sizeof(skb_info->driver_data)); + memset(>status, 0, sizeof(info->status)); + memset(info->driver_data, 0, sizeof(info->driver_data)); - skb_info->driver_data[1] = dev_cmd; + info->driver_data[1] = dev_cmd; return dev_cmd; } @@ -478,25 +476,22 @@ iwl_mvm_set_tx_params(struct iwl_mvm *mvm, struct sk_buff *skb, int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct sk_buff *skb) { struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)skb->data; - struct ieee80211_tx_info *skb_info = IEEE80211_SKB_CB(skb); - struct ieee80211_tx_info info; + struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb); struct iwl_device_cmd *dev_cmd; struct iwl_tx_cmd *tx_cmd; u8 sta_id; int hdrlen = ieee80211_hdrlen(hdr->frame_control); - memcpy(, skb->cb, sizeof(info)); - - if (WARN_ON_ONCE(info.flags & IEEE80211_TX_CTL_AMPDU)) + if (WARN_ON_ONCE(info->flags & IEEE80211_TX_CTL_AMPDU)) return -1; - if (WARN_ON_ONCE(info.flags & IEEE80211_TX_CTL_SEND_AFTER_DTIM && -(!info.control.vif || - info.hw_queue != info.control.vif->cab_queue))) + if (WARN_ON_ONCE(info->flags & IEEE80211_TX_CTL_SEND_AFTER_DTIM && +(!info->control.vif || + info->hw_queue != info->control.vif->cab_queue))) return -1; /* This holds the amsdu headers length */ - skb_info->driver_data[0] = (void *)(uintptr_t)0; + info->driver_data[0] = (void *)(uintptr_t)0; /* * IWL_MVM_OFFCHANNEL_QUEUE is used for
Re: [GIT] Networking
On Wed, May 18, 2016 at 12:51 PM, Linus Torvaldswrote: > On Wed, May 18, 2016 at 11:45 AM, Linus Torvalds > wrote: >> >> From what I can tell, there's a merge bug in commit 909b27f70643, >> where David seems to have lost some of the changes to >> iwl_mvm_set_tx_cmd(). >> >> I do not know if that's the reason for the problem I see. But I will test. > > Yes. The attached patch that fixes the incorrect merge seems to fix > things for me. > > That should mean that the assumption that this problem existed in v4.6 > too was wrong, because the incorrect merge came in later. I think > Luciano mis-understood "v4.6+" to mean plain v4.6. > > Reinoud Koornstra, does this patch fix things for you too? Indeed, I meant 4.6+, not 4.6. The patch you attached doesn't change existing code for me in 4.6+ as these two lines are already in there. Thanks, Reinoud. > >Linus
Re: [GIT] Networking
From: Linus TorvaldsDate: Wed, 18 May 2016 11:45:06 -0700 > David, do you happen to recall that merge conflict? I think you must > have removed that "skb_info" variable declaration and initialization > manually (due to the "unused variable" warning, which in turn was due > to the incorrect merge of the actual conflict), because I think git > would have merged that line into the result. Yes, I know I buggered this merge conflict and Kalle said he'd have a fix coming my way ASAP. Sorry, I was travelling today, so I'll catch up with this tomorrow.
Re: [GIT] Networking
Linus Torvaldswrites: > On Wed, May 18, 2016 at 11:58 AM, Kalle Valo wrote: >> >> It would be best if you could send a patch either directly to Dave or >> Linus to resolve this quickly. > > I'm committing my patch myself right now, since this bug makes my > laptop useless, and I will take credit for finding and testing it on > my own Kiitti :) > even if it was apparently also discussed independently on the > networking list ;) Yeah, sorry about taking this too long. -- Kalle Valo
Re: [GIT] Networking
On Wed, 2016-05-18 at 12:00 -0700, Linus Torvalds wrote: > On Wed, May 18, 2016 at 11:58 AM, Kalle Valo> wrote: > > > > > > It would be best if you could send a patch either directly to Dave > > or > > Linus to resolve this quickly. > I'm committing my patch myself right now, since this bug makes my > laptop useless, and I will take credit for finding and testing it on > my own even if it was apparently also discussed independently on the > networking list ;) Great! :) You beat me by a few minutes, even though I had the whole day to play with it. :\ -- Cheers, Luca.
Re: [GIT] Networking
On Wed, May 18, 2016 at 11:58 AM, Kalle Valowrote: > > It would be best if you could send a patch either directly to Dave or > Linus to resolve this quickly. I'm committing my patch myself right now, since this bug makes my laptop useless, and I will take credit for finding and testing it on my own even if it was apparently also discussed independently on the networking list ;) Linus
Re: [GIT] Networking
"Coelho, Luciano"writes: > Kalle, David, what is the status with the fix that is on the way via > your trees? It would be best if you could send a patch either directly to Dave or Linus to resolve this quickly. -- Kalle Valo
Re: [GIT] Networking
On Wed, May 18, 2016 at 11:45 AM, Linus Torvaldswrote: > > From what I can tell, there's a merge bug in commit 909b27f70643, > where David seems to have lost some of the changes to > iwl_mvm_set_tx_cmd(). > > I do not know if that's the reason for the problem I see. But I will test. Yes. The attached patch that fixes the incorrect merge seems to fix things for me. That should mean that the assumption that this problem existed in v4.6 too was wrong, because the incorrect merge came in later. I think Luciano mis-understood "v4.6+" to mean plain v4.6. Reinoud Koornstra, does this patch fix things for you too? Linus drivers/net/wireless/intel/iwlwifi/mvm/tx.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c index 880210917a6f..c53aa0f220e0 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c @@ -211,6 +211,7 @@ void iwl_mvm_set_tx_cmd(struct iwl_mvm *mvm, struct sk_buff *skb, struct iwl_tx_cmd *tx_cmd, struct ieee80211_tx_info *info, u8 sta_id) { + struct ieee80211_tx_info *skb_info = IEEE80211_SKB_CB(skb); struct ieee80211_hdr *hdr = (void *)skb->data; __le16 fc = hdr->frame_control; u32 tx_flags = le32_to_cpu(tx_cmd->tx_flags); @@ -294,7 +295,7 @@ void iwl_mvm_set_tx_cmd(struct iwl_mvm *mvm, struct sk_buff *skb, tx_cmd->tx_flags = cpu_to_le32(tx_flags); /* Total # bytes to be transmitted */ tx_cmd->len = cpu_to_le16((u16)skb->len + - (uintptr_t)info->driver_data[0]); + (uintptr_t)skb_info->driver_data[0]); tx_cmd->life_time = cpu_to_le32(TX_CMD_LIFE_TIME_INFINITE); tx_cmd->sta_id = sta_id;
Re: [GIT] Networking
On Wed, 2016-05-18 at 11:45 -0700, Linus Torvalds wrote: > On Wed, May 18, 2016 at 7:23 AM, Coelho, Luciano >wrote: > > > > > > I can confirm that 4.6 contains the same bug. And reverting the > > patch > > I mentioned does solve the problem... > > > > The same patch works fine in our internal tree. I'll have to > > figure > > out together with Emmanuel what the problem actually is. > Hmm. > > From what I can tell, there's a merge bug in commit 909b27f70643, > where David seems to have lost some of the changes to > iwl_mvm_set_tx_cmd(). > > The reason seems to be a conflict with d8fe484470dd, where David > missed the fact that "info->driver_data[0]" had become > "skb_info->driver_data[0]", and then he removed the skb_info because > it was unused. > > I do not know if that's the reason for the problem I see. But I will > test. > > David, do you happen to recall that merge conflict? I think you must > have removed that "skb_info" variable declaration and initialization > manually (due to the "unused variable" warning, which in turn was due > to the incorrect merge of the actual conflict), because I think git > would have merged that line into the result. Actually I just tested it and indeed it seems to be the merge damage (which we discussed extensively in the linux-wireless mailing list) that causes this problem. The "4.6 doesn't work either" thing was a false alarm. If the merge damage is fixed this way, the problem is gone: diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c index b5f7c36..ae2ecf6 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c @@ -211,6 +211,7 @@ void iwl_mvm_set_tx_cmd(struct iwl_mvm *mvm, struct sk_buff *skb, struct iwl_tx_cmd *tx_cmd, struct ieee80211_tx_info *info, u8 sta_id) { + struct ieee80211_tx_info *skb_info = IEEE80211_SKB_CB(skb); struct ieee80211_hdr *hdr = (void *)skb->data; __le16 fc = hdr->frame_control; u32 tx_flags = le32_to_cpu(tx_cmd->tx_flags); @@ -294,7 +295,7 @@ void iwl_mvm_set_tx_cmd(struct iwl_mvm *mvm, struct sk_buff *skb, tx_cmd->tx_flags = cpu_to_le32(tx_flags); /* Total # bytes to be transmitted */ tx_cmd->len = cpu_to_le16((u16)skb->len + - (uintptr_t)info->driver_data[0]); + (uintptr_t)skb_info->driver_data[0]); tx_cmd->life_time = cpu_to_le32(TX_CMD_LIFE_TIME_INFINITE); tx_cmd->sta_id = sta_id; Kalle, David, what is the status with the fix that is on the way via your trees? -- Cheers, Luca.
Re: [GIT] Networking
On Wed, May 18, 2016 at 7:23 AM, Coelho, Lucianowrote: > > I can confirm that 4.6 contains the same bug. And reverting the patch > I mentioned does solve the problem... > > The same patch works fine in our internal tree. I'll have to figure > out together with Emmanuel what the problem actually is. Hmm. >From what I can tell, there's a merge bug in commit 909b27f70643, where David seems to have lost some of the changes to iwl_mvm_set_tx_cmd(). The reason seems to be a conflict with d8fe484470dd, where David missed the fact that "info->driver_data[0]" had become "skb_info->driver_data[0]", and then he removed the skb_info because it was unused. I do not know if that's the reason for the problem I see. But I will test. David, do you happen to recall that merge conflict? I think you must have removed that "skb_info" variable declaration and initialization manually (due to the "unused variable" warning, which in turn was due to the incorrect merge of the actual conflict), because I think git would have merged that line into the result. Linus
Re: [GIT] Networking
On Wed, 2016-05-18 at 06:51 -0600, Reinoud Koornstra wrote: > On Wed, May 18, 2016 at 6:41 AM, Coelho, Luciano >wrote: > > > > On Wed, 2016-05-18 at 06:20 -0600, Reinoud Koornstra wrote: > > > > > > On Wed, May 18, 2016 at 4:51 AM, Coelho, Luciano > > > wrote: > > > > > > > > > > > > Hi Emmanuel, Linus, > > > > > > > > > > > > On Wed, 2016-05-18 at 06:37 +0300, Emmanuel Grumbach wrote: > > > > > > > > > > > > > > > On Wed, May 18, 2016 at 4:00 AM, Linus Torvalds > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > On Tue, May 17, 2016 at 12:11 PM, David Miller > > > > > loft > > > > > > .net > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > Highlights: > > > > > > Lowlights: > > > > > > > > > > > > 1) the iwlwifi driver seems to be broken > > > > > > > > > > > > My laptop that uses the intel 7680 iwlwifi module no longer > > > > > > connects > > > > > > to the network. It fails with a "Microcode SW error > > > > > > detected." > > > > > > and > > > > > > spews out register state over and over again. > > > > > Can we have the register state and the ASSERT / NMI / > > > > > whatever > > > > > that > > > > > goes along with it? > > > > > This clearly means that the firmware is crashing, but I don't > > > > > know > > > > > why, > > > > > I copied here the lines that I need from another bug with > > > > > another > > > > > device with another firmware, > > > > > but the log that we will still explain what I need: > > > > I managed to reproduce this bug locally with Linus' > > > > master. I'm > > > > investigating the cause and I'll let you how it goes. > > > I did run the latest git code as well 4.6+ > > > iwlwifi went pearshape in my case as well. > > > I just updated the microcode as well, it didn't matter. > > > 4.6-rc7 works fine and no errors are reported with iwlwifi. > > > > > > Here's output that might come in handy > > Thanks! That is helpful. > > > > Since you say that 4.6-rc7 works but 4.6 doesn't, the prime suspect > > is > > commit 5c08b0f5026f ("iwlwifi: mvm: don't override the rate with > > the > > AMSDU len"), which is the only iwlwifi patch between those two > > releases. > > > > Could you try to revert that and see if the error is gone? > Will do, since git revert failed I'll revert manually and report > back. I can confirm that 4.6 contains the same bug. And reverting the patch I mentioned does solve the problem... The same patch works fine in our internal tree. I'll have to figure out together with Emmanuel what the problem actually is. -- Luca.
Re: [GIT] Networking
On Wed, May 18, 2016 at 6:41 AM, Coelho, Lucianowrote: > On Wed, 2016-05-18 at 06:20 -0600, Reinoud Koornstra wrote: >> On Wed, May 18, 2016 at 4:51 AM, Coelho, Luciano >> wrote: >> > >> > Hi Emmanuel, Linus, >> > >> > >> > On Wed, 2016-05-18 at 06:37 +0300, Emmanuel Grumbach wrote: >> > > >> > > On Wed, May 18, 2016 at 4:00 AM, Linus Torvalds >> > > wrote: >> > > > >> > > > >> > > > On Tue, May 17, 2016 at 12:11 PM, David Miller > > > > .net >> > > > > >> > > > > wrote: >> > > > > >> > > > > >> > > > > Highlights: >> > > > Lowlights: >> > > > >> > > > 1) the iwlwifi driver seems to be broken >> > > > >> > > > My laptop that uses the intel 7680 iwlwifi module no longer >> > > > connects >> > > > to the network. It fails with a "Microcode SW error detected." >> > > > and >> > > > spews out register state over and over again. >> > > Can we have the register state and the ASSERT / NMI / whatever >> > > that >> > > goes along with it? >> > > This clearly means that the firmware is crashing, but I don't >> > > know >> > > why, >> > > I copied here the lines that I need from another bug with another >> > > device with another firmware, >> > > but the log that we will still explain what I need: >> > I managed to reproduce this bug locally with Linus' master. I'm >> > investigating the cause and I'll let you how it goes. >> I did run the latest git code as well 4.6+ >> iwlwifi went pearshape in my case as well. >> I just updated the microcode as well, it didn't matter. >> 4.6-rc7 works fine and no errors are reported with iwlwifi. >> >> Here's output that might come in handy > > Thanks! That is helpful. > > Since you say that 4.6-rc7 works but 4.6 doesn't, the prime suspect is > commit 5c08b0f5026f ("iwlwifi: mvm: don't override the rate with the > AMSDU len"), which is the only iwlwifi patch between those two > releases. > > Could you try to revert that and see if the error is gone? Will do, since git revert failed I'll revert manually and report back. > > -- > Cheers, > Luca.
Re: [GIT] Networking
On Wed, 2016-05-18 at 06:20 -0600, Reinoud Koornstra wrote: > On Wed, May 18, 2016 at 4:51 AM, Coelho, Luciano >wrote: > > > > Hi Emmanuel, Linus, > > > > > > On Wed, 2016-05-18 at 06:37 +0300, Emmanuel Grumbach wrote: > > > > > > On Wed, May 18, 2016 at 4:00 AM, Linus Torvalds > > > wrote: > > > > > > > > > > > > On Tue, May 17, 2016 at 12:11 PM, David Miller > > > .net > > > > > > > > > > wrote: > > > > > > > > > > > > > > > Highlights: > > > > Lowlights: > > > > > > > > 1) the iwlwifi driver seems to be broken > > > > > > > > My laptop that uses the intel 7680 iwlwifi module no longer > > > > connects > > > > to the network. It fails with a "Microcode SW error detected." > > > > and > > > > spews out register state over and over again. > > > Can we have the register state and the ASSERT / NMI / whatever > > > that > > > goes along with it? > > > This clearly means that the firmware is crashing, but I don't > > > know > > > why, > > > I copied here the lines that I need from another bug with another > > > device with another firmware, > > > but the log that we will still explain what I need: > > I managed to reproduce this bug locally with Linus' master. I'm > > investigating the cause and I'll let you how it goes. > I did run the latest git code as well 4.6+ > iwlwifi went pearshape in my case as well. > I just updated the microcode as well, it didn't matter. > 4.6-rc7 works fine and no errors are reported with iwlwifi. > > Here's output that might come in handy Thanks! That is helpful. Since you say that 4.6-rc7 works but 4.6 doesn't, the prime suspect is commit 5c08b0f5026f ("iwlwifi: mvm: don't override the rate with the AMSDU len"), which is the only iwlwifi patch between those two releases. Could you try to revert that and see if the error is gone? -- Cheers, Luca.
Re: [GIT] Networking
On Wed, May 18, 2016 at 4:51 AM, Coelho, Lucianowrote: > Hi Emmanuel, Linus, > > > On Wed, 2016-05-18 at 06:37 +0300, Emmanuel Grumbach wrote: >> On Wed, May 18, 2016 at 4:00 AM, Linus Torvalds >> wrote: >> > >> > On Tue, May 17, 2016 at 12:11 PM, David Miller > > > wrote: >> > > >> > > >> > > Highlights: >> > Lowlights: >> > >> > 1) the iwlwifi driver seems to be broken >> > >> > My laptop that uses the intel 7680 iwlwifi module no longer >> > connects >> > to the network. It fails with a "Microcode SW error detected." and >> > spews out register state over and over again. >> Can we have the register state and the ASSERT / NMI / whatever that >> goes along with it? >> This clearly means that the firmware is crashing, but I don't know >> why, >> I copied here the lines that I need from another bug with another >> device with another firmware, >> but the log that we will still explain what I need: > > I managed to reproduce this bug locally with Linus' master. I'm > investigating the cause and I'll let you how it goes. I did run the latest git code as well 4.6+ iwlwifi went pearshape in my case as well. I just updated the microcode as well, it didn't matter. 4.6-rc7 works fine and no errors are reported with iwlwifi. Here's output that might come in handy [ 17.436340] iwlwifi :04:00.0: loaded firmware version 16.242414.0 op_mode iwlmvm [ 17.714920] iwlwifi :04:00.0: Detected Intel(R) Dual Band Wireless AC 7260, REV=0x144 SNIP [ 114.837923] wlp4s0: authenticate with 00:30:44:1d:cf:2b [ 114.841365] wlp4s0: send auth to 00:30:44:1d:cf:2b (try 1/3) [ 114.842073] wlp4s0: send auth to 00:30:44:1d:cf:2b (try 2/3) [ 115.041992] iwlwifi :04:00.0: Microcode SW error detected. Restarting 0x200. [ 115.041995] iwlwifi :04:00.0: CSR values: [ 115.041996] iwlwifi :04:00.0: (2nd byte of CSR_INT_COALESCING is CSR_INT_PERIODIC_REG) [ 115.042000] iwlwifi :04:00.0:CSR_HW_IF_CONFIG_REG: 0X40489204 [ 115.042003] iwlwifi :04:00.0: CSR_INT_COALESCING: 0X8040 [ 115.042006] iwlwifi :04:00.0: CSR_INT: 0X [ 115.042009] iwlwifi :04:00.0:CSR_INT_MASK: 0X [ 115.042013] iwlwifi :04:00.0: CSR_FH_INT_STATUS: 0X [ 115.042016] iwlwifi :04:00.0: CSR_GPIO_IN: 0X [ 115.042019] iwlwifi :04:00.0: CSR_RESET: 0X [ 115.042022] iwlwifi :04:00.0:CSR_GP_CNTRL: 0X080403c5 [ 115.042026] iwlwifi :04:00.0: CSR_HW_REV: 0X0144 [ 115.042029] iwlwifi :04:00.0: CSR_EEPROM_REG: 0X [ 115.042032] iwlwifi :04:00.0: CSR_EEPROM_GP: 0X8000 [ 115.042035] iwlwifi :04:00.0: CSR_OTP_GP_REG: 0X803a [ 115.042038] iwlwifi :04:00.0: CSR_GIO_REG: 0X001f0044 [ 115.042042] iwlwifi :04:00.0:CSR_GP_UCODE_REG: 0X [ 115.042045] iwlwifi :04:00.0: CSR_GP_DRIVER_REG: 0X [ 115.042048] iwlwifi :04:00.0: CSR_UCODE_DRV_GP1: 0X [ 115.042051] iwlwifi :04:00.0: CSR_UCODE_DRV_GP2: 0X [ 115.042054] iwlwifi :04:00.0: CSR_LED_REG: 0X0060 [ 115.042058] iwlwifi :04:00.0:CSR_DRAM_INT_TBL_REG: 0X88035a74 [ 115.042061] iwlwifi :04:00.0:CSR_GIO_CHICKEN_BITS: 0X27800200 [ 115.042064] iwlwifi :04:00.0: CSR_ANA_PLL_CFG: 0Xd5d5 [ 115.042067] iwlwifi :04:00.0: CSR_MONITOR_STATUS_REG: 0X3d0801bd [ 115.042070] iwlwifi :04:00.0: CSR_HW_REV_WA_REG: 0X0001001a [ 115.042074] iwlwifi :04:00.0:CSR_DBG_HPET_MEM_REG: 0X [ 115.042075] iwlwifi :04:00.0: FH register values: [ 115.042086] iwlwifi :04:00.0: FH_RSCSR_CHNL0_STTS_WPTR_REG: 0X455fd200 [ 115.042097] iwlwifi :04:00.0: FH_RSCSR_CHNL0_RBDCB_BASE_REG: 0X04556370 [ 115.042108] iwlwifi :04:00.0: FH_RSCSR_CHNL0_WPTR: 0X0078 [ 115.042119] iwlwifi :04:00.0: FH_MEM_RCSR_CHNL0_CONFIG_REG: 0X00801114 [ 115.042129] iwlwifi :04:00.0: FH_MEM_RSSR_SHARED_CTRL_REG: 0X00fc [ 115.042140] iwlwifi :04:00.0: FH_MEM_RSSR_RX_STATUS_REG: 0X0303 [ 115.042151] iwlwifi :04:00.0: FH_MEM_RSSR_RX_ENABLE_ERR_IRQ2DRV: 0X [ 115.042162] iwlwifi :04:00.0: FH_TSSR_TX_STATUS_REG: 0X07ff0001 [ 115.042173] iwlwifi :04:00.0: FH_TSSR_TX_ERROR_REG: 0X [ 115.042278] iwlwifi :04:00.0: Start IWL Error Log Dump: [ 115.042279] iwlwifi :04:00.0: Status: 0x, count: 6 [ 115.042280] iwlwifi :04:00.0: Loaded firmware version: 16.242414.0 [ 115.042281] iwlwifi :04:00.0: 0x0034 | NMI_INTERRUPT_WDG [ 115.042282] iwlwifi :04:00.0: 0x059002A0 | trm_hw_status0 [ 115.042283] iwlwifi :04:00.0: 0x | trm_hw_status1 [ 115.042284] iwlwifi :04:00.0:
Re: [GIT] Networking
Hi Emmanuel, Linus, On Wed, 2016-05-18 at 06:37 +0300, Emmanuel Grumbach wrote: > On Wed, May 18, 2016 at 4:00 AM, Linus Torvalds >wrote: > > > > On Tue, May 17, 2016 at 12:11 PM, David Miller > > wrote: > > > > > > > > > Highlights: > > Lowlights: > > > > 1) the iwlwifi driver seems to be broken > > > > My laptop that uses the intel 7680 iwlwifi module no longer > > connects > > to the network. It fails with a "Microcode SW error detected." and > > spews out register state over and over again. > Can we have the register state and the ASSERT / NMI / whatever that > goes along with it? > This clearly means that the firmware is crashing, but I don't know > why, > I copied here the lines that I need from another bug with another > device with another firmware, > but the log that we will still explain what I need: I managed to reproduce this bug locally with Linus' master. I'm investigating the cause and I'll let you how it goes. -- Cheers, Luca.
Re: [GIT] Networking
On Wed, May 18, 2016 at 4:00 AM, Linus Torvaldswrote: > On Tue, May 17, 2016 at 12:11 PM, David Miller wrote: >> >> Highlights: > > Lowlights: > > 1) the iwlwifi driver seems to be broken > > My laptop that uses the intel 7680 iwlwifi module no longer connects > to the network. It fails with a "Microcode SW error detected." and > spews out register state over and over again. Can we have the register state and the ASSERT / NMI / whatever that goes along with it? This clearly means that the firmware is crashing, but I don't know why, I copied here the lines that I need from another bug with another device with another firmware, but the log that we will still explain what I need: [ 800.880402] iwlwifi :02:00.0: Start IWL Error Log Dump: [ 800.880406] iwlwifi :02:00.0: Status: 0x, count: 6 [ 800.880409] iwlwifi :02:00.0: Loaded firmware version: 21.311951.0 [ 800.880413] iwlwifi :02:00.0: 0x0394 | ADVANCED_SYSASSERT [ 800.880416] iwlwifi :02:00.0: 0x0220 | trm_hw_status0 [ 800.880419] iwlwifi :02:00.0: 0x | trm_hw_status1 [ 800.880422] iwlwifi :02:00.0: 0x0BD8 | branchlink2 [ 800.880425] iwlwifi :02:00.0: 0x00026AC4 | interruptlink1 [ 800.880428] iwlwifi :02:00.0: 0x | interruptlink2 [ 800.880431] iwlwifi :02:00.0: 0x0001 | data1 [ 800.880434] iwlwifi :02:00.0: 0x02039845 | data2 [ 800.880437] iwlwifi :02:00.0: 0x0056 | data3 [ 800.880440] iwlwifi :02:00.0: 0x8E4184A7 | beacon time [ 800.880443] iwlwifi :02:00.0: 0x30E2CB41 | tsf low [ 800.880446] iwlwifi :02:00.0: 0x0027 | tsf hi [ 800.880449] iwlwifi :02:00.0: 0x | time gp1 [ 800.880451] iwlwifi :02:00.0: 0x2F842F8A | time gp2 [ 800.880454] iwlwifi :02:00.0: 0x | uCode revision type [ 800.880457] iwlwifi :02:00.0: 0x0015 | uCode version major [ 800.880460] iwlwifi :02:00.0: 0x0004C28F | uCode version minor [ 800.880463] iwlwifi :02:00.0: 0x0201 | hw version [ 800.880466] iwlwifi :02:00.0: 0x00489008 | board version [ 800.880469] iwlwifi :02:00.0: 0x001C | hcmd [ 800.880472] iwlwifi :02:00.0: 0x24022000 | isr0 [ 800.880475] iwlwifi :02:00.0: 0x0100 | isr1 [ 800.880478] iwlwifi :02:00.0: 0x580A | isr2 [ 800.880481] iwlwifi :02:00.0: 0x4041FCC1 | isr3 [ 800.880483] iwlwifi :02:00.0: 0x | isr4 [ 800.880486] iwlwifi :02:00.0: 0x00800110 | last cmd Id [ 800.880489] iwlwifi :02:00.0: 0x | wait_event [ 800.880492] iwlwifi :02:00.0: 0x02C8 | l2p_control [ 800.880495] iwlwifi :02:00.0: 0x00018030 | l2p_duration [ 800.880498] iwlwifi :02:00.0: 0x00BF | l2p_mhvalid [ 800.880501] iwlwifi :02:00.0: 0x00EF | l2p_addr_match [ 800.880503] iwlwifi :02:00.0: 0x000D | lmpm_pmg_sel [ 800.880506] iwlwifi :02:00.0: 0x30031805 | timestamp [ 800.880509] iwlwifi :02:00.0: 0xE0F0 | flow_handler > > The last thing it says before falling over is: > > wlp1s0: authenticate with xx:xx:xx:xx:xx:xx > wlp1s0: send auth to xx:xx:xx:xx:xx:xx (try 1/3) > wlp1s0: send auth to xx:xx:xx:xx:xx:xx (try 2/3) > > and then it goes all titsup. > > I thought that it might be because I had downloaded one of the daily > firmware versions (it calls itself iwlwifi-7260-17.ucode, but isn't a > real release afaik - but it has worked fien for me before), but the > problem persists with the ver-16 ucode too, so that wasn't it. > > I haven't bisected it, but there is absolutely nothing odd in my hardware. > > I do have a 802.11ac network, which apparently not everybody does, > judging by previous bug-reports of mine.. > > Intel iwlwifi people: please check this out. > >Linus > -- > To unsubscribe from this list: send the line "unsubscribe linux-wireless" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
On Tue, May 17, 2016 at 12:11 PM, David Millerwrote: > > Highlights: Lowlights: 1) the iwlwifi driver seems to be broken My laptop that uses the intel 7680 iwlwifi module no longer connects to the network. It fails with a "Microcode SW error detected." and spews out register state over and over again. The last thing it says before falling over is: wlp1s0: authenticate with xx:xx:xx:xx:xx:xx wlp1s0: send auth to xx:xx:xx:xx:xx:xx (try 1/3) wlp1s0: send auth to xx:xx:xx:xx:xx:xx (try 2/3) and then it goes all titsup. I thought that it might be because I had downloaded one of the daily firmware versions (it calls itself iwlwifi-7260-17.ucode, but isn't a real release afaik - but it has worked fien for me before), but the problem persists with the ver-16 ucode too, so that wasn't it. I haven't bisected it, but there is absolutely nothing odd in my hardware. I do have a 802.11ac network, which apparently not everybody does, judging by previous bug-reports of mine.. Intel iwlwifi people: please check this out. Linus
Re: [GIT] Networking
On 3/19/2016 6:42 AM, David Miller wrote: There is a merge conflict against the RDMA tree pull wrt the mlx4 driver, which I don't know how to resolve. Can you please point on the conflict ? is it still open that needs our input ?
Re: [GIT] Networking
On Mon, Nov 09 2015, Hannes Frederic Sowawrote: > Hi, > > On Wed, Oct 28, 2015, at 15:27, Rasmus Villemoes wrote: >> >> I agree - proper overflow checking can be really hard. Quick, assuming a >> and b have the same unsigned integer type, is 'a+b> check overflow? Of course not (hint: promotion rules). And as you say, >> it gets even more complicated for signed types. >> >> A few months ago I tried posting a complete set of fallbacks for older >> compilers (https://lkml.org/lkml/2015/7/19/358), but nothing really >> happened. Now I know where Linus stands, so I guess I can just delete >> that branch. > > I actually like your approach of being type agnostic a bit more (in > comparison to static inline functions), mostly because of one specific > reason: > > The type agnostic __builtin_*_overflow function even do the correct > things if you deal with types smaller than int. Imagine e.g. you want to > add to unsigned chars a and b, If you read my mail again you'll see that I mentioned exactly this :-) so obviously I agree that this is a nice part of it. > unsigned char a, b; > if (a + b < a) > goto overflow; > else > a += b; > > The overflow condition will never trigger, as the comparisons will > always be done in the integer domain and a + b < a is never true. I > actually think that this is easy to overlook and the functions should > handle that. Yes. While people very rarely use local u8 or u16 variables for computations, I think one could imagine a and b being struct members, which for one reason or another happens to be of a type narrower than int (which would also make the issue much harder to spot since the struct definition is far away). Something like combine_packets(struct foo *a, const struct foo *b) { if (a->len + b->len < a->len) return -EOVERFLOW; /* ensure a->payload is big enough...*/ memcpy(a->payload + a->len, b->payload, b->len); a->len += b->len; ... } which, depending on details, would either lead to memory corruption or loss of parts of the packets. I haven't actually found any instance of this in the kernel, but that doesn't mean it couldn't get introduced (or that it doesn't exist). Aside: It turns out clang is smart enough to optimize away the broken overflow check, but gcc isn't. Neither issue a warning, despite the intention being rather clear. Rasmus -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
Hello, Ingo Molnarwrites: > * Linus Torvalds wrote: > >> Does anybody have any particular other "uhhuh, overflow in multiplication" >> issues in mind? Because the interface for a saturating multiplication (or >> addition, for that matter) would actually be much easier. And would be >> trivial >> to have as an inline asm for compatibility with older versions of gcc too. >> >> Then you could just do that jiffies conversion - or allocation, for that >> matter >> - without any special overflow handling at all. Doing >> >> buf = kmalloc(sat_mul(sizeof(x), nr), GFP_KERNEL); >> >> would just magically work. > > Exactly: saturation is the default behavior for many GPU vector/pixel > attributes > as well, to simplify and speed up the code and the hardware. I always wanted > our > ABIs to saturate instead of duplicating complexity with overflow failure > logic. I don't think saturation arithmetic is useful at all in the kernel as a replacement for overflow/wrap-around checks. Linus' example has a discrepancy between what the caller expects and the actual number of bytes allocated. Imagine sat_mul does the operation in signed char and kmalloc takes only signed chars as an argument, it could actually be a huge discrepancy that could lead to security vulnerabilities. The call should definitely error out here and not try to allocate memory of some different size and return it to the caller. > In the kernel the first point of failure is missing overflow checks. The > second > point of failure are buggy overflow checks. We can eliminate both if we just > use > safe operations that produce output that never exit the valid range. This > also > happens to result in the simplest code. We should start thinking of overflow > checks as rootkit enablers. Sorry, I don't understand that at all. sat_mul is a rootkit enabler, I fear. If you allocate a smalelr portion of memory as the caller actually asked for because of saturation logic, this definitely could lead to memory corruption and hard to diagnose bugs. > And note how much this simplifies review and static analysis: if this is the > dominant model used in new kernel code then the analysis (human or machine) > would > only have to ensure that no untrusted input values get multiplied (or added) > in an > unsafe way. It would not have to be able to understand and track any > 'overflow > logic' through a maze of return paths, and validate whether the 'overflow > logic' > is correct for all input parameter ranges... Sorry, I don't really understand that proposal. :/ > The flip side is marginally less ABI robustness: random input parameters due > to > memory corruption will just saturate and produce nonsensical results. I don't > think it's a big issue, and I also think the simplicity of input parameter > validation is _way_ more important than our behavior to random input - but > I've > been overruled in the past when trying to introduce saturating ABIs, so > saturation > is something people sometimes find inelegant. If those nonsensical results are memory corruptions I also don't agree. I think we need to be very much accurate when dealing with overflows. Bye, Hannes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
Hello, Ingo Molnarwrites: > * Linus Torvalds wrote: > >> Does anybody have any particular other "uhhuh, overflow in multiplication" >> issues in mind? Because the interface for a saturating multiplication (or >> addition, for that matter) would actually be much easier. And would be >> trivial >> to have as an inline asm for compatibility with older versions of gcc too. >> >> Then you could just do that jiffies conversion - or allocation, for that >> matter >> - without any special overflow handling at all. Doing >> >> buf = kmalloc(sat_mul(sizeof(x), nr), GFP_KERNEL); >> >> would just magically work. > > Exactly: saturation is the default behavior for many GPU vector/pixel > attributes > as well, to simplify and speed up the code and the hardware. I always wanted > our > ABIs to saturate instead of duplicating complexity with overflow failure > logic. I don't think saturation arithmetic is useful at all in the kernel as a replacement for overflow/wrap-around checks. Linus' example has a discrepancy between what the caller expects and the actual number of bytes allocated. Imagine sat_mul does the operation in signed char and kmalloc takes only signed chars as an argument, it could actually be a huge discrepancy that could lead to security vulnerabilities. The call should definitely error out here and not try to allocate memory of some different size and return it to the caller. > In the kernel the first point of failure is missing overflow checks. The > second > point of failure are buggy overflow checks. We can eliminate both if we just > use > safe operations that produce output that never exit the valid range. This > also > happens to result in the simplest code. We should start thinking of overflow > checks as rootkit enablers. Sorry, I don't understand that at all. sat_mul is a rootkit enabler, I fear. If you allocate a smalelr portion of memory as the caller actually asked for because of saturation logic, this definitely could lead to memory corruption and hard to diagnose bugs. > And note how much this simplifies review and static analysis: if this is the > dominant model used in new kernel code then the analysis (human or machine) > would > only have to ensure that no untrusted input values get multiplied (or added) > in an > unsafe way. It would not have to be able to understand and track any > 'overflow > logic' through a maze of return paths, and validate whether the 'overflow > logic' > is correct for all input parameter ranges... Sorry, I don't really understand that proposal. :/ > The flip side is marginally less ABI robustness: random input parameters due > to > memory corruption will just saturate and produce nonsensical results. I don't > think it's a big issue, and I also think the simplicity of input parameter > validation is _way_ more important than our behavior to random input - but > I've > been overruled in the past when trying to introduce saturating ABIs, so > saturation > is something people sometimes find inelegant. If those nonsensical results are memory corruptions I also don't agree. I think we need to be very much accurate when dealing with overflows. Bye, Hannes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
* Linus Torvaldswrote: > Does anybody have any particular other "uhhuh, overflow in multiplication" > issues in mind? Because the interface for a saturating multiplication (or > addition, for that matter) would actually be much easier. And would be > trivial > to have as an inline asm for compatibility with older versions of gcc too. > > Then you could just do that jiffies conversion - or allocation, for that > matter > - without any special overflow handling at all. Doing > > buf = kmalloc(sat_mul(sizeof(x), nr), GFP_KERNEL); > > would just magically work. Exactly: saturation is the default behavior for many GPU vector/pixel attributes as well, to simplify and speed up the code and the hardware. I always wanted our ABIs to saturate instead of duplicating complexity with overflow failure logic. In the kernel the first point of failure is missing overflow checks. The second point of failure are buggy overflow checks. We can eliminate both if we just use safe operations that produce output that never exit the valid range. This also happens to result in the simplest code. We should start thinking of overflow checks as rootkit enablers. And note how much this simplifies review and static analysis: if this is the dominant model used in new kernel code then the analysis (human or machine) would only have to ensure that no untrusted input values get multiplied (or added) in an unsafe way. It would not have to be able to understand and track any 'overflow logic' through a maze of return paths, and validate whether the 'overflow logic' is correct for all input parameter ranges... The flip side is marginally less ABI robustness: random input parameters due to memory corruption will just saturate and produce nonsensical results. I don't think it's a big issue, and I also think the simplicity of input parameter validation is _way_ more important than our behavior to random input - but I've been overruled in the past when trying to introduce saturating ABIs, so saturation is something people sometimes find inelegant. Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
Hi, On Wed, Oct 28, 2015, at 15:27, Rasmus Villemoes wrote: > On Wed, Oct 28 2015, Hannes Frederic Sowa> wrote: > > > Hi Linus, > > > > On Wed, Oct 28, 2015, at 10:39, Linus Torvalds wrote: > >> Get rid of it. And I don't *ever* want to see that shit again. > > > > I don't want to give up on that this easily: > > > > In future I would like to see an interface like this. It is often hard > > to do correct overflow/wrap-around tests and it would be great if there > > are helper functions which could easily and without a lot of thinking be > > used by people to remove those problems from the kernel. > > I agree - proper overflow checking can be really hard. Quick, assuming a > and b have the same unsigned integer type, is 'a+b check overflow? Of course not (hint: promotion rules). And as you say, > it gets even more complicated for signed types. > > A few months ago I tried posting a complete set of fallbacks for older > compilers (https://lkml.org/lkml/2015/7/19/358), but nothing really > happened. Now I know where Linus stands, so I guess I can just delete > that branch. I actually like your approach of being type agnostic a bit more (in comparison to static inline functions), mostly because of one specific reason: The type agnostic __builtin_*_overflow function even do the correct things if you deal with types smaller than int. Imagine e.g. you want to add to unsigned chars a and b, unsigned char a, b; if (a + b < a) goto overflow; else a += b; The overflow condition will never trigger, as the comparisons will always be done in the integer domain and a + b < a is never true. I actually think that this is easy to overlook and the functions should handle that. The macro version does this quite nicely. Bye, Hannes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
On Fri, Nov 6, 2015 at 7:27 AM, David Laightwrote: >> From: Linus Torvalds >> Sent: 03 November 2015 20:45 >> On Tue, Nov 3, 2015 at 12:05 PM, Linus Torvalds >> wrote: >> > result = add_overflow( >> > mul_overflow(sec, SEC_CONVERSION, ), >> > mul_overflow(nsec, NSEC_CONVERSION, ), >> > ); >> > >> > return overflow ? MAX_JIFFIES : result; >> >> Thinking more about this example, I think the gcc interface for >> multiplication overflow is fine. >> >> It would end up something like >> >> if (mul_overflow(sec, SEC_CONVERSION, )) >> return MAX_JIFFY_OFFSET; >> if (mul_overflow(nsec, NSEC_CONVERSION, )) >> return MAX_JIFFY_OFFSET; >> sum = sec + nsec; >> if (sum < sec || sum > MAX_JIFFY_OFFSET) >> return MAX_JIFFY_OFFSET; >> return sum; >> >> and that doesn't look horribly ugly to me. > > If mul_overflow() is a real function you've just forced some of the > values out to memory, generating a 'clobber' for all memory > (unless 'strict-aliasing' is enabled) and making a mess of other > optimisations. > (If it is a static inline that might not happen.) I doubt anyone would ever make it a real function. On new gcc, it would be an inline backed by an intrinsic. On old gcc it would be a normal inline or perhaps an inline with inline asm in it. --Andy -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [GIT] Networking
> From: Linus Torvalds > Sent: 03 November 2015 20:45 > On Tue, Nov 3, 2015 at 12:05 PM, Linus Torvalds >wrote: > > result = add_overflow( > > mul_overflow(sec, SEC_CONVERSION, ), > > mul_overflow(nsec, NSEC_CONVERSION, ), > > ); > > > > return overflow ? MAX_JIFFIES : result; > > Thinking more about this example, I think the gcc interface for > multiplication overflow is fine. > > It would end up something like > > if (mul_overflow(sec, SEC_CONVERSION, )) > return MAX_JIFFY_OFFSET; > if (mul_overflow(nsec, NSEC_CONVERSION, )) > return MAX_JIFFY_OFFSET; > sum = sec + nsec; > if (sum < sec || sum > MAX_JIFFY_OFFSET) > return MAX_JIFFY_OFFSET; > return sum; > > and that doesn't look horribly ugly to me. If mul_overflow() is a real function you've just forced some of the values out to memory, generating a 'clobber' for all memory (unless 'strict-aliasing' is enabled) and making a mess of other optimisations. (If it is a static inline that might not happen.) If you assume that no one is stupid enough to multiply very large values by 1 and not get an error you could have mul_overflow() return the largest prime if the multiply overflowed. David
Re: [GIT] Networking
Hello, On Tue, Nov 3, 2015, at 03:38, Linus Torvalds wrote: > On Mon, Nov 2, 2015 at 5:58 PM, Andy Lutomirski> wrote: > > > > Based in part on an old patch by Sasha, what if we relied on CSE: > > > > if (mul_would_overflow(size, n)) > > return NULL; > > do_something_with(size * n); > > I suspect we wouldn't even have to rely on CSE. Are these things in > performance-critical areas? I suspect our current "use divides" is > actually slower than just using two multiplies, even if one is only > used for overflow checking. And furthermore we don't actually have to rely on CSE if we want to, our overflow checks could look much more simpler as in "ordinary" C code because we tell the compiler that signed overflow is defined throughout the kernel ( -fno-strict-overflow). Thus the checks can be done after the calculations. > That said, I also think that for something like this, where we > actually have a *reason* for using a special overflow helper function, > we could just try to use the gcc syntax. > > I don't think it's wonderful syntax, but at least there's an excuse > for odd/ugly code in those kinds of situations. The reason I hated the > unsigned subtraction case so much was that the simple obvious code > just avoids all those issues entirely, and there wasn't any real > *reason* for the nasty syntax. For multiplication overflow, we'd at > least have a reason. > > Sadly, the *nice* syntax, where we'd do something like "goto label" > when the multiply overflows, does not mesh well with inline asm. Yes, > gcc now has "asm goto", but you can't use it with an output value ;( I don't understand why you consider inline asm? Those builtins already normally produce very reasonable code (and yes, I checked). We can wrap the gcc builtins anyway and adapt the syntax as needed. inline asm does prohibit constant folding etc, so a __builtin_constant_p check would be necessary or helpful further adding complexity. > But from a syntactic standpoint, the best syntax might actually be > something like > > mul = multiply_with_overflow(size, n, error_label); > do_something_with(mul); > > error_label: > return NULL; > > and it would *almost* be possible to do this with inline asm if it > wasn't for the annoying "no output values" case. There are many other > cases where I'd have wanted to do this (ie the whole "fetch value from > user space, but if an exception happens, point the exception handler > at the label). I don't see the problem with the if (multiply_with_overflow(...)) overflowed_handle_error(...); multiply_with_overflow can have a __must_check attribute, so we see warnings if return value is not checked immediately. It allows chaining easily if (multiply_with_overflow(...) || multiply_with_overflow(...)) goto overflow; without adding checks between the different stages or calculation. It just composes nicely. The error handling is very explicit. > Back in May, we talked to the gcc people about allowing output values > that are unspecified for the "goto" case (so only a fallthrough would > have them set), but I think that that doesn't match how gcc internally > thinks about branching instructions.. > > But you could still hide it inside a macro and make it expand to > something like > >#define multiply_with_overflow(size, n, error_label) ({ > unsigned long result, error; \ > .. do multiply in asm, set result and error... \ > if (error) goto error_label; > result; }) > > which would allow the above kind of odd hand-coded try/catch model in > C. Which I think would be pretty understandable and not very prone to > getting it wrong. Hmm? Hiding branches in macros seems not to be a good idea to me at all. I actually think a lot of users in functions would simply check their arguments and return -EINVAL in case they overflow. Forcing them to do a jump seems inappropriate. I also don't think that reordering the arguments makes a lot of sense: bool overflow; int a = multiply_with_overflow(b, c, ); if (overflow) error out; This scheme might be composable if we ||= the overflow flag in the helper functions/macros and force the user to initialize the overflow boolean it to false in the beginning. Way too many things that can go wrong and an auditor has to verify. Bye, Hannes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
On Tue, Nov 3, 2015 at 4:53 AM, Hannes Frederic Sowawrote: > > And furthermore we don't actually have to rely on CSE if we want to, our > overflow checks could look much more simpler as in "ordinary" C code > because we tell the compiler that signed overflow is defined throughout > the kernel ( -fno-strict-overflow). Thus the checks can be done after > the calculations. Yes. We could actually do overflow checks by just verifying the result, even for signed stuff. > I don't understand why you consider inline asm? Those builtins already > normally produce very reasonable code Those built-ins only exist with gcc-5+, afaik. We'll have people who rely on old versions of gcc for *years* after gcc5 is commonly available. They'll be running enterprise distros or debian-stable or things like that. So we do need to have reasonable backwards compatibility functions. For things like multiplication overflow, inline asm may be the best way to do that. That said, we'll be able to work around it, I'm sure. But no, we're not going to be in the situation where we just know we can use the builtins. > I don't see the problem with the > > if (multiply_with_overflow(...)) > overflowed_handle_error(...); I do agree that it's likely not a big issue. That said, I may be influenced by hardware design, but I think I'm also influenced by traditional good C rules: I like functions that return the *result*, so that the result can be used in a chain of calculations. Like hardware, the "overflow" bit is separate and I actually think the gcc overflow functions did the calling convention wrong. So even if we do the "pass one of the results by reference" thing, I'd much rather that "pas by reference" be for the overflow condition. And hardware that does it well tends to not just give an "overflow" result, but a "summary overflow", so that you can do multiple operations in series, and then just check the "summary overflow" at the end. So my gut feel is that overflow should either be an exception (ie the whole "jump to another place" model), or it should be a flag value, but it shouldn't be the "result" of the function. For example, one of the overflow issues we've had occasionally has been not about a single op, but a series of operations: "multiply-and-add". Look at __timespec64_to_jiffies(), for example, where the operation that can overflow is "seconds * SEC_CONVERSION + nsec * NSEC_CONVERSION". Now, in that case we currently handle the overflow by just knowing that 'nsec' had better follow certain rules, so we can simply check seconds against a known maximum, and we don't need to get the "exact" overflow condition. And quite honestly, that may end up *always* being the right thing to do - there just isn't any real reason to worry about individual operations overflowing. But imagine that we did. The "summary overflow" interface would allow us to do something like bool overflow = 0; result = add_overflow( mul_overflow(sec, SEC_CONVERSION, ), mul_overflow(nsec, NSEC_CONVERSION, ), ); return overflow ? MAX_JIFFIES : result; which I'm not at all actually advocating (because (a) I think the current code is simpler and (b) I don't like the silly "add_overflow()" anyway), but that I'm giving as an example of why I think the gcc builtin result passing choice looks a bit odd to me. > multiply_with_overflow can have a __must_check attribute, so we see > warnings if return value is not checked immediately. Yes. There may be advantages to that too. That said, I'm not seeing that as a big deal. If you use the overflow functions and don't check the overflow condition, you kind of have bigger issues than "I'd like to get a compiler warning". That's more of a "WTF is the person doing" thing). Linus -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
On Tue, Nov 3, 2015 at 12:05 PM, Linus Torvaldswrote: > result = add_overflow( > mul_overflow(sec, SEC_CONVERSION, ), > mul_overflow(nsec, NSEC_CONVERSION, ), > ); > > return overflow ? MAX_JIFFIES : result; Thinking more about this example, I think the gcc interface for multiplication overflow is fine. It would end up something like if (mul_overflow(sec, SEC_CONVERSION, )) return MAX_JIFFY_OFFSET; if (mul_overflow(nsec, NSEC_CONVERSION, )) return MAX_JIFFY_OFFSET; sum = sec + nsec; if (sum < sec || sum > MAX_JIFFY_OFFSET) return MAX_JIFFY_OFFSET; return sum; and that doesn't look horribly ugly to me. That said, I do wonder how many of our interfaces really want overflow, and how many just want saturation (or even clamping to a maximum value). For example, one of the *common* cases of multiplication overflow we have had is for memory allocation where we do things like buffer = kmalloc(sizeof(something) * nr, GFP_KERNEL); and we've fixed them by moving them to 'kcalloc()'. But as with the jiffies conversion above, it would actually be sufficient to just saturate to a maximum value instead, and depending on that causing the allocation to fail. So it may actually be that most users really don't even *want* "overflow". Does anybody have any particular other "uhhuh, overflow in multiplication" issues in mind? Because the interface for a saturating multiplication (or addition, for that matter) would actually be much easier. And would be trivial to have as an inline asm for compatibility with older versions of gcc too. Then you could just do that jiffies conversion - or allocation, for that matter - without any special overflow handling at all. Doing buf = kmalloc(sat_mul(sizeof(x), nr), GFP_KERNEL); would just magically work. And the above jiffies conversion would still want to clamp things to MAX_JIFFY_OFFSET (because we consider "jiffies" to be an offset from now, and while it's "unsigned long", we clamp the offset to half the range), but it would still be a rather natural model for it too. Linus -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
On 10/28/2015 02:39 AM, Linus Torvalds wrote: I'm sorry, but we don't add idiotic new interfaces like this for idiotic new code like that. As one of the people who encouraged gcc to add this interface, I'll speak up in its favor: Getting overflow checking right in more complicated cases is a PITA. I'll admit that the "subtract from an unsigned integer if it won't go negative" isn't particularly useful, but there are other cases in which it's much more useful. The one I care about the most is for multiplication. Witness the never-ending debates about the proper way to implement things like kmalloc_array. We currently do: static inline void *kmalloc_array(size_t n, size_t size, gfp_t flags) { if (size != 0 && n > SIZE_MAX / size) return NULL; return __kmalloc(n * size, flags); } This is correct, and it's even reasonably efficient if size is a compile-time constant. (On x86, it still might not be quite optimal, since there'll be an extra cmp instruction. Sure, the difference could easily be a cycle or even less.) But if size is not a constant, then, unless the compiler is quite clever, this ends up generating a division, and that sucks. If we were willing to do: size_t total_bytes; #if efficient_overflow_detection_works if (__builtin_mul_overflow(n, size, _bytes)) return NULL; #else /* existing check goes here */ total_bytes = n * size; #endif return __kmalloc(n * size, flags); then we get optimal code generation on new compilers and the result isn't even that ugly to look at. For compiler flag settings in which signed overflow can cause subtle disasters, the signed addition overflow helpers can be nice, too. --Andy -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
On Mon, Nov 2, 2015 at 1:16 PM, Linus Torvaldswrote: > On Mon, Nov 2, 2015 at 12:34 PM, Andy Lutomirski wrote: >> >> Getting overflow checking right in more complicated cases is a PITA. > > No it is not. Not for unsigned values. Just to clarify. The "oevrflow" test for unsigned subtracts of "a-b" (it's really an underflow, but whatever) really is just (b > a) Really. That's it. Claiming that that is "complicated" and needs a helper function is not something sane people do. A fifth-grader that isn't good at math can understand that. In contrast, nobody sane understands "usub_overflow(a, b, )". So really. Stop making inane arguments. Linus -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
Hello, On Mon, Nov 2, 2015, at 22:30, Andy Lutomirski wrote: > On Mon, Nov 2, 2015 at 1:19 PM, Linus Torvalds >wrote: > > On Mon, Nov 2, 2015 at 1:16 PM, Linus Torvalds > > wrote: > >> On Mon, Nov 2, 2015 at 12:34 PM, Andy Lutomirski wrote: > >>> > >>> Getting overflow checking right in more complicated cases is a PITA. > >> > >> No it is not. Not for unsigned values. > > > > Just to clarify. The "oevrflow" test for unsigned subtracts of "a-b" > > (it's really an underflow, but whatever) really is just > > > >(b > a) > > > > Really. That's it. Claiming that that is "complicated" and needs a > > helper function is not something sane people do. A fifth-grader that > > isn't good at math can understand that. > > > > In contrast, nobody sane understands "usub_overflow(a, b, )". > > > > So really. Stop making inane arguments. > > I'll stop making inane arguments if you stop bashing arguments I > didn't make. :) I said the helpers were useful for multiplication (by > which I meant both signed and unsigned) and, to a lesser extent, for > signed addition and subtraction. > > I don't believe I even tried to justify usub_overflow as anything > other than an extremely minor optimization that probably isn't > worthwhile. overflow_usub was part of a larger header I already prepared to offer support for *all* overflow_* checking builtins. While fixing this IPv6 bug I thought I could hopefully introduce this interface slowly and simply cut away the other versions. The reasoning from my side, while I totally agree that unsigned add/sub is very easy to test for, is that after using those builtins for a while I absolutely don't consider them in any way unpleasant to read (but mainly with signed operations, I have to admit). They do the one operation and if something overflows or wraps around, error out and enforce the failure case on the true branch. I simply like the name 'overflow' (also it could be changed to wraparound in unsigned operations) in the code. Also chaining multiple operations where each one could overflow can simply be put into a single if with short-eval or. Simply, my point in submitting overflow_usub was to provide all helpers sometime soon in this way but not submit any without users. I simply was working down the list. > --Andy, who still has inline asm that does 'cmovo' and such in his > code for work, sigh. In the past I also prepared some inline asm functions with __builtin_constant_p and inline asm to do those checks but considered them to complicated for submission. ;) Linus, I am totally fine with leaving out the usub and uadd operations but hope you are willing to accept the multiplication and signed add/sub versions. Bye, Hannes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
On Mon, Nov 2, 2015 at 1:19 PM, Linus Torvaldswrote: > On Mon, Nov 2, 2015 at 1:16 PM, Linus Torvalds > wrote: >> On Mon, Nov 2, 2015 at 12:34 PM, Andy Lutomirski wrote: >>> >>> Getting overflow checking right in more complicated cases is a PITA. >> >> No it is not. Not for unsigned values. > > Just to clarify. The "oevrflow" test for unsigned subtracts of "a-b" > (it's really an underflow, but whatever) really is just > >(b > a) > > Really. That's it. Claiming that that is "complicated" and needs a > helper function is not something sane people do. A fifth-grader that > isn't good at math can understand that. > > In contrast, nobody sane understands "usub_overflow(a, b, )". > > So really. Stop making inane arguments. I'll stop making inane arguments if you stop bashing arguments I didn't make. :) I said the helpers were useful for multiplication (by which I meant both signed and unsigned) and, to a lesser extent, for signed addition and subtraction. I don't believe I even tried to justify usub_overflow as anything other than an extremely minor optimization that probably isn't worthwhile. --Andy, who still has inline asm that does 'cmovo' and such in his code for work, sigh. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
On Mon, Nov 2, 2015 at 12:34 PM, Andy Lutomirskiwrote: > On 10/28/2015 02:39 AM, Linus Torvalds wrote: >> >> I'm sorry, but we don't add idiotic new interfaces like this for >> idiotic new code like that. > > > As one of the people who encouraged gcc to add this interface, I'll speak up > in its favor: > > Getting overflow checking right in more complicated cases is a PITA. No it is not. Not for unsigned values. Stop this idiocy. Yes, overflow for signed integers can be complex. But not for unsigned ones. And that disgusting "overflow_usub()" in no way makes the code more readable. EVER. So stop just making things up. A helper function *could* have been more legible and more efficient, if it had been about something completely different. But in this case it really wasn't. It wasn't more efficient, it wasn't more legible, and it simply had no excuse for it. Stop making excuses for shit. Linus -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
On Mon, Nov 2, 2015 at 2:14 PM, Hannes Frederic Sowawrote: > > overflow_usub was part of a larger header I already prepared to offer > support for *all* overflow_* checking builtins. While fixing this IPv6 > bug I thought I could hopefully introduce this interface slowly and > simply cut away the other versions. Hell no. Both you and Andy seem to argue that "since there are other totally unrelated functions that look superficially similar and actually some sense, we should add these stupid crap functions too". In exactly *WHAT* crazy universe does that make sense as an argument? It's like saying "I put literal shit on your plate, because there are potentially nutritious sausages that look superficially a bit like the dogshit I served you". Seriously. The fact that _valid_ overflow checking functions exist in _no_ way support the crap that I got. It's *exactly* the same argument as "dog poop superficially looks like good sausages". Is that really your argument? There is never an excuse for "usub_overflow()". It's that simple. No amount of _other_ overflow functions make that shit palatable. Linus -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
On Mon, 2015-11-02 at 13:30 -0800, Andy Lutomirski wrote: > > I'll stop making inane arguments if you stop bashing arguments I > didn't make. :) I said the helpers were useful for multiplication (by > which I meant both signed and unsigned) and, to a lesser extent, for > signed addition and subtraction. > > I don't believe I even tried to justify usub_overflow as anything > other than an extremely minor optimization that probably isn't > worthwhile. Also how much of the problem is simply that the function signature (naming and choice of arguments) just plain sucks ? Cheers, Ben. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
On Mon, Nov 2, 2015 at 4:56 PM, Benjamin Herrenschmidtwrote: > > Also how much of the problem is simply that the function signature > (naming and choice of arguments) just plain sucks ? Some of that is pretty much inevitable. C really has no good way to return multiple values. The traditional (pass pointer to fill in result) one simply doesn't result in good-looking code. We've occasionally done it by simply breaking C syntax: see "do_div()" (which returns a value *and* just changes the first argument directly as a macro). People have tended to absolutely hate it, and while it can be very practical, it has certainly also resulted in people being confused. It was originally done for printing numbers, where the whole "return remainder and divide argument" model was really convenient. Sometimes we've done it by knowing the value space: the whole "return error value _or_ a resulting pointer value" by just knowing the domains (ie "ERR_PTR()" end friends). That tends to work really badly for arithmetic overflows, though. And at other times, we've returned structures, which can efficiently contain two words, and gcc generates reasonable code for. The *natural* thing to do would actually be to trap and set a flag. We do that with put_user_try { ... } put_user_catch(err); which sets "err" if one of the "put_user_ex()" or calls in between traps. The "try/catch" model would probably be the best one syntactically for overflow handling. It could even be done with macros (ie the "catch" thing would contain a special overflow label, and the "overflow functions" would then just jump to that labeln in the error case as a way to avoid the "return two different values" thing). Of course, try/catch only really makes sense if you have multiple operations that can overflow in different parts. That's can be the pattern in many other cases, but for the kernel it's quite unusual. It's seldom more than one single operation we need to worry about in any particular sequence (the "put_user_try/catch" use we have is exactly because signal handling writes multiple different values to the same stack when it generates the stack frame). And with all that said, realistically, in the kernel we seldom have a ton of complex overflow issues. Most of the values we have are unsigned, and we have historically just done them manually with sum = a+b; if (sum < a) .. handle error .. which really doesn't get much better from any syntactic stuff around it (because any other syntax will involve the whole "how do I return two values" problem and make it less legible). Gcc is even smart enough to turn that into a "just check the carry flag" if the patterns are simple enough, so the simple approach can even generate optimal code. The biggest problem - and where the compiler could actually help us - tends to be multiplication overflows. We have several (not *many*, but certainly more than just a couple) cases where we simply check by dividing MAX_INT or something. See for example kmalloc_array(), which does if (size != 0 && n > SIZE_MAX / size) return NULL; exactly to avoid the overflow when it does the "n*size" allocation. So for multiplication, we really *could* use overflow logic. It's not horribly common, but it definitely happens. Signed integer overflows are annoying even for simple addition and subtraction, but I can't off-hand recall any real case where that was even an issue in the kernel. Linus -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
On Mon, Nov 2, 2015 at 5:54 PM, Linus Torvaldswrote: > > The biggest problem - and where the compiler could actually help us - > tends to be multiplication overflows. We have several (not *many*, but > certainly more than just a couple) cases where we simply check by > dividing MAX_INT or something. > > See for example kmalloc_array(), which does > > if (size != 0 && n > SIZE_MAX / size) > return NULL; > > exactly to avoid the overflow when it does the "n*size" allocation. > > So for multiplication, we really *could* use overflow logic. It's not > horribly common, but it definitely happens. > Based in part on an old patch by Sasha, what if we relied on CSE: if (mul_would_overflow(size, n)) return NULL; do_something_with(size * n); I haven't checked, but it would be sad if gcc couldn't optimize this correctly if we use the builtins. The downside is that I don't see off the top of my head how this could be implemented using inline asm if we want a fast fallback when the builtins aren't available. --Andy -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
On Mon, Nov 2, 2015 at 5:58 PM, Andy Lutomirskiwrote: > > Based in part on an old patch by Sasha, what if we relied on CSE: > > if (mul_would_overflow(size, n)) > return NULL; > do_something_with(size * n); I suspect we wouldn't even have to rely on CSE. Are these things in performance-critical areas? I suspect our current "use divides" is actually slower than just using two multiplies, even if one is only used for overflow checking. That said, I also think that for something like this, where we actually have a *reason* for using a special overflow helper function, we could just try to use the gcc syntax. I don't think it's wonderful syntax, but at least there's an excuse for odd/ugly code in those kinds of situations. The reason I hated the unsigned subtraction case so much was that the simple obvious code just avoids all those issues entirely, and there wasn't any real *reason* for the nasty syntax. For multiplication overflow, we'd at least have a reason. Sadly, the *nice* syntax, where we'd do something like "goto label" when the multiply overflows, does not mesh well with inline asm. Yes, gcc now has "asm goto", but you can't use it with an output value ;( But from a syntactic standpoint, the best syntax might actually be something like mul = multiply_with_overflow(size, n, error_label); do_something_with(mul); error_label: return NULL; and it would *almost* be possible to do this with inline asm if it wasn't for the annoying "no output values" case. There are many other cases where I'd have wanted to do this (ie the whole "fetch value from user space, but if an exception happens, point the exception handler at the label). Back in May, we talked to the gcc people about allowing output values that are unspecified for the "goto" case (so only a fallthrough would have them set), but I think that that doesn't match how gcc internally thinks about branching instructions.. But you could still hide it inside a macro and make it expand to something like #define multiply_with_overflow(size, n, error_label) ({ unsigned long result, error; \ .. do multiply in asm, set result and error... \ if (error) goto error_label; result; }) which would allow the above kind of odd hand-coded try/catch model in C. Which I think would be pretty understandable and not very prone to getting it wrong. Hmm? Linus -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
From: David MillerDate: Thu, 29 Oct 2015 08:19:41 -0700 (PDT) > This is the same as the previous pull request, with the ipv6 overflow > fix redone. The merge conflict is therefore gone too. ... > Please pull, thanks a lot. > > The following changes since commit 1099f86044111e9a7807f09523e42d4c9d0fb781: > > Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2015-10-19 > 09:55:40 -0700) Ping? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
From: Linus TorvaldsDate: Wed, 28 Oct 2015 18:39:56 +0900 > Get rid of it. And I don't *ever* want to see that shit again. No problem, I'll revert it all. I asked Hannes to repost his patches to linux-kernel hoping someone would review and say it stunk or not, give him some feedback, or whatever, and nobody reviewed the changes at all... -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
On Wed, Oct 28 2015, Hannes Frederic Sowawrote: > Hi Linus, > > On Wed, Oct 28, 2015, at 10:39, Linus Torvalds wrote: >> Get rid of it. And I don't *ever* want to see that shit again. > > I don't want to give up on that this easily: > > In future I would like to see an interface like this. It is often hard > to do correct overflow/wrap-around tests and it would be great if there > are helper functions which could easily and without a lot of thinking be > used by people to remove those problems from the kernel. I agree - proper overflow checking can be really hard. Quick, assuming a and b have the same unsigned integer type, is 'a+b
Re: [GIT] Networking
On Wed, Oct 28, 2015 at 3:32 PM, David Millerwrote: > > This may look a bit scary this late in the release cycle, but as is typically > the case it's predominantly small driver fixes all over the place. Christ people. This is just sh*t. The conflict I get is due to stupid new gcc header file crap. But what makes me upset is that the crap is for completely bogus reasons. This is the old code in net/ipv6/ip6_output.c: mtu -= hlen + sizeof(struct frag_hdr); and this is the new "improved" code that uses fancy stuff that wants magical built-in compiler support and has silly wrapper functions for when it doesn't exist: if (overflow_usub(mtu, hlen + sizeof(struct frag_hdr), ) || mtu <= 7) goto fail_toobig; and anybody who thinks that the above is (a) legible (b) efficient (even with the magical compiler support) (c) particularly safe is just incompetent and out to lunch. The above code is sh*t, and it generates shit code. It looks bad, and there's no reason for it. The code could *easily* have been done with just a single and understandable conditional, and the compiler would actually have generated better code, and the code would look better and more understandable. Why is this not if (mtu < hlen + sizeof(struct frag_hdr) + 8) goto fail_toobig; mtu -= hlen + sizeof(struct frag_hdr); which is the same number of lines, doesn't use crazy helper functions that nobody knows what they do, and is much more obvious what it actually does. I guarantee that the second more obvious version is easier to read and understand. Does anybody really want to dispute this? Really. Give me *one* reason why it was written in that idiotic way with two different conditionals, and a shiny new nonstandard function that wants particular compiler support to generate even half-way sane code, and even then generates worse code? A shiny function that we have never ever needed anywhere else, and that is just compiler-masturbation. And yes, you still could have overflow issues if the whole "hlen + xyz" expression overflows, but quite frankly, the "overflow_usub()" code had that too. So if you worry about that, then you damn well didn't do the right thing to begin with. So I really see no reason for this kind of complete idiotic crap. Tell me why. Because I'm not pulling this kind of completely insane stuff that generates conflicts at rc7 time, and that seems to have absolutely no reason for being anm idiotic unreadable mess. The code seems *designed* to use that new "overflow_usub()" code. It seems to be an excuse to use that function. And it's a f*cking bad excuse for that braindamage. I'm sorry, but we don't add idiotic new interfaces like this for idiotic new code like that. Yes, yes, if this had stayed inside the network layer I would never have noticed. But since I *did* notice, I really don't want to pull this. In fact, I want to make it clear to *everybody* that code like this is completely unacceptable. Anybody who thinks that code like this is "safe" and "secure" because it uses fancy overflow detection functions is so far out to lunch that it's not even funny. All this kind of crap does is to make the code a unreadable mess with code that no sane person will ever really understand what it actually does. Get rid of it. And I don't *ever* want to see that shit again. Linus -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
Hi Linus, On Wed, Oct 28, 2015, at 10:39, Linus Torvalds wrote: > On Wed, Oct 28, 2015 at 3:32 PM, David Miller> wrote: > > > > This may look a bit scary this late in the release cycle, but as is > > typically > > the case it's predominantly small driver fixes all over the place. > > Christ people. This is just sh*t. > > The conflict I get is due to stupid new gcc header file crap. But what > makes me upset is that the crap is for completely bogus reasons. > > This is the old code in net/ipv6/ip6_output.c: > > mtu -= hlen + sizeof(struct frag_hdr); > > and this is the new "improved" code that uses fancy stuff that wants > magical built-in compiler support and has silly wrapper functions for > when it doesn't exist: > > if (overflow_usub(mtu, hlen + sizeof(struct frag_hdr), ) || > mtu <= 7) > goto fail_toobig; > > and anybody who thinks that the above is > > (a) legible > (b) efficient (even with the magical compiler support) > (c) particularly safe I still want to present an argument in favor of those overflow functions: On a very quick look it is obvious that someone cared about wrap-around or overflow on this code without diagnosing the checks above. And that was the reason I tried to use it. > is just incompetent and out to lunch. > > The above code is sh*t, and it generates shit code. It looks bad, and > there's no reason for it. Yes, overflow_usub is a bad example where IMHO the compiler cannot improve a lot. I think it gets more interesting in case of signed integers where the compiler can simply generate a seto instruction instead of manually checking the input variables for ranges before doing the calculation. E.g. especially for multiplication it is quite clear. > The code could *easily* have been done with just a single and > understandable conditional, and the compiler would actually have > generated better code, and the code would look better and more > understandable. Why is this not > > if (mtu < hlen + sizeof(struct frag_hdr) + 8) > goto fail_toobig; > mtu -= hlen + sizeof(struct frag_hdr); > > which is the same number of lines, doesn't use crazy helper functions > that nobody knows what they do, and is much more obvious what it > actually does. > > I guarantee that the second more obvious version is easier to read and > understand. Does anybody really want to dispute this? When reading through the code I have to jump back from the third line back to the first one just to check if the lengths adds up. I absolutely see your point, but I don't find the overflow_* helpers horrid but still useful. If one is used to how the arguments line up on the overflow helpers I find them quite easy to read. > Really. Give me *one* reason why it was written in that idiotic way > with two different conditionals, and a shiny new nonstandard function > that wants particular compiler support to generate even half-way sane > code, and even then generates worse code? A shiny function that we > have never ever needed anywhere else, and that is just > compiler-masturbation. I agree as a bugfix I could have find a simpler solution. > And yes, you still could have overflow issues if the whole "hlen + > xyz" expression overflows, but quite frankly, the "overflow_usub()" > code had that too. So if you worry about that, then you damn well > didn't do the right thing to begin with. Sure, there was no need to test against that because it couldn't. > So I really see no reason for this kind of complete idiotic crap. > > Tell me why. Because I'm not pulling this kind of completely insane > stuff that generates conflicts at rc7 time, and that seems to have > absolutely no reason for being anm idiotic unreadable mess. I can understand that and will fix it up asap for rc7. > The code seems *designed* to use that new "overflow_usub()" code. It > seems to be an excuse to use that function. Somehow I feel a bit guilty here. :) Actually I do find them quite handy. > And it's a f*cking bad excuse for that braindamage. > > I'm sorry, but we don't add idiotic new interfaces like this for > idiotic new code like that. > > Yes, yes, if this had stayed inside the network layer I would never > have noticed. But since I *did* notice, I really don't want to pull > this. In fact, I want to make it clear to *everybody* that code like > this is completely unacceptable. Anybody who thinks that code like > this is "safe" and "secure" because it uses fancy overflow detection > functions is so far out to lunch that it's not even funny. All this > kind of crap does is to make the code a unreadable mess with code that > no sane person will ever really understand what it actually does. > > Get rid of it. And I don't *ever* want to see that shit again. I don't want to give up on that this easily: In future I would like to see an interface like this. It is often hard to do correct overflow/wrap-around tests and it
Re: [GIT] Networking
On Sep 8 22:16, Konrad Rzeszutek Wilk wrote: > On Tue, Sep 8, 2015 at 10:14 PM, Konrad Rzeszutek Wilk >wrote: > > > > (Removed Linus and Andrew from the To, added Corinna ..) > > and resending again without HTML (sorry, thought I had HTML-emails > disabled by default) > > > > On Thu, Sep 3, 2015 at 1:35 AM, David Miller wrote: > >> > >> > >> Another merge window, another set of networking changes. I've heard > >> rumblings that the lightweight tunnels infrastructure has been voted > >> networking change of the year. But what do I know? > >> > > .. snip.. > >> > >> Corinna Vinschen (2): > >> r8169: Add values missing in @get_stats64 from HW counters Thanks, this has been reported over the weekend in https://bugzilla.kernel.org/show_bug.cgi?id=104031 Strangly this didn't occur during my tests. Looking into it... Corinna pgpQbD31yZa48.pgp Description: PGP signature
Re: [GIT] Networking
On Tue, Sep 8, 2015 at 10:14 PM, Konrad Rzeszutek Wilkwrote: > > (Removed Linus and Andrew from the To, added Corinna ..) and resending again without HTML (sorry, thought I had HTML-emails disabled by default) > > On Thu, Sep 3, 2015 at 1:35 AM, David Miller wrote: >> >> >> Another merge window, another set of networking changes. I've heard >> rumblings that the lightweight tunnels infrastructure has been voted >> networking change of the year. But what do I know? >> > .. snip.. >> >> Corinna Vinschen (2): >> r8169: Add values missing in @get_stats64 from HW counters > > > .. cases an regression when SWIOTLB is in use. (See full attached serial > log), but > the relevant snippet is: > > [ 12.010065] BUG: sleeping function called from invalid context at > /home/konrad/ssd/konrad/linux/mm/page_alloc.c:3186 > [ 12.88021e49b938 8174ff50 8800c6637748 > [ 12.051548] ttyS1: 4 input overrun(s) > [ 12.055485] 8800c6637140 88021e49b958 810cf22e > 0001^G^G^G^G^G^G^G^G > [ 12.064566] 88021e49b988^G^G^G^G^G^G^G^G > 810cf2cd 88021e49b9b4 > [ 12.073639] Call Trace: > [ 12.076156] [] dump_stack+0x4f/0x68 > [ 12.081444] [] ___might_sleep+0xde/0x130 > [ 12.087176] [] __might_sleep+0x4d/0x90 > [ 12.092731] [] __alloc_pages_nodemask+0x26f/0xa20 > [ 12.099271] [] ? create_object+0x21e/0x2c0 > [ 12.105183] [] ? kmemleak_alloc+0x23/0x40 > [ 12.111006] [] ? kmem_cache_alloc_trace+0x184/0x190 > [ 12.117721] [] ? kmemleak_alloc+0x23/0x40 > [ 12.123546] [] dma_generic_alloc_coherent+0x9d/0x140 > [ 12.130354] [] x86_swiotlb_alloc_coherent+0x30/0x60 > [ 12.137072] [] dma_alloc_attrs+0x4c/0xb0 > [ 12.142808] [] rtl8169_update_counters+0x7e/0x150 > [r8169] > [ 12.150061] [] rtl8169_get_stats64+0xcb/0x130 [r8169] > [ 12.156956] [] dev_get_stats+0x38/0x90 > [ 12.162511] [] dev_seq_printf_stats+0x23/0x100 > [ 12.168786] [] ? create_object+0x21e/0x2c0 > [ 12.174715] [] dev_seq_show+0xf/0x30 > [ 12.180098] [] seq_read+0x26a/0x400 > [ 12.185384] [] proc_reg_read+0x3e/0x70 > [ 12.190943] [] __vfs_read+0x2f/0xe0 > [ 12.196245] [] ? security_file_permission+0xa2/0xb0 > [ 12.202972] [] ? rw_verify_area+0x58/0xe0 > [ 12.208799] [] vfs_read+0x92/0xd0 > [ 12.213908] [] ? __fdget+0xe/0x10 > [ 12.219024] [] SyS_read+0x51/0xb0 > [ 12.224140] [] entry_SYSCALL_64_fastpath+0x12/0x71 > done. > > If I revert 6e85d5ad36a26debc23a9a865c029cbe242b2dc8 "r8169: Add values > missing in @get_stats64 from HW counters" > I don't get this message. > > Thank you! >> >> tst038 Description: Binary data
Re: [GIT] Networking
> On Sep 7, 2015, at 4:02 AM, David Laightwrote: > > Feed: > int bar(int (*f)[10]) { return sizeof *f; } > into cc -O2 -S and look at the generated code - returns 40 not 4. Yes, indeed it does. And with clang too. I guess I was too easily discouraged when looking for a workable syntax some years ago. At the time I stopped when the typedef worked, which really just encapsulates this. I should have recognized that then. Thanks for making it all so clear. -- Mark Rustad, Networking Division, Intel Corporation signature.asc Description: Message signed with OpenPGP using GPGMail
RE: [GIT] Networking
From: Rustad, Mark D ... > >> static int smp_ah(struct crypto_blkcipher *tfm, const u8 irk[16], > >> const u8 r[3], u8 res[3]) > > > > Expect that it looks like you are passing arrays by value, > > but instead you are passing by reference. > > > > Explicitly pass by reference and sizeof works. > > It depends on what you mean by works. It at least doesn't look so misleading > when passing by reference > and so works more as expected. The sizeof in either case will never return > the size of the array. To > have sizeof return the size of the array would require a typedef of the array > to pass by reference. In > some cases that could be the right thing to do. Feed: int bar(int (*f)[10]) { return sizeof *f; } into cc -O2 -S and look at the generated code - returns 40 not 4. David -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html