Re: [RFC 2/7] ath10k: Add support to process rx packet in thread
On 3/22/21 6:20 PM, Brian Norris wrote: On Mon, Mar 22, 2021 at 4:58 PM Ben Greear wrote: On 7/22/20 6:00 AM, Felix Fietkau wrote: On 2020-07-22 14:55, Johannes Berg wrote: On Wed, 2020-07-22 at 14:27 +0200, Felix Fietkau wrote: I'm considering testing a different approach (with mt76 initially): - Add a mac80211 rx function that puts processed skbs into a list instead of handing them to the network stack directly. Would this be *after* all the mac80211 processing, i.e. in place of the rx-up-to-stack? Yes, it would run all the rx handlers normally and then put the resulting skbs into a list instead of calling netif_receive_skb or napi_gro_frags. Whatever came of this? I realized I'm running Felix's patch since his mt76 driver needs it. Any chance it will go upstream? If you're asking about $subject (moving NAPI/RX to a thread), this landed upstream recently: http://git.kernel.org/linus/adbb4fb028452b1b0488a1a7b66ab856cdf20715 It needs a bit of coaxing to work on a WiFi driver (including: WiFi drivers tend to have a different netdev for NAPI than they expose to /sys/class/net/), but it's there. I'm not sure if people had something else in mind in the stuff you're quoting though. No, I got it confused with something Felix did: https://github.com/greearb/mt76/blob/master/patches/0001-net-add-support-for-threaded-NAPI-polling.patch Maybe the NAPI/RX to a thread thing superceded Felix's patch? Thanks, Ben Brian -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [RFC 2/7] ath10k: Add support to process rx packet in thread
On 7/22/20 6:00 AM, Felix Fietkau wrote: On 2020-07-22 14:55, Johannes Berg wrote: On Wed, 2020-07-22 at 14:27 +0200, Felix Fietkau wrote: I'm considering testing a different approach (with mt76 initially): - Add a mac80211 rx function that puts processed skbs into a list instead of handing them to the network stack directly. Would this be *after* all the mac80211 processing, i.e. in place of the rx-up-to-stack? Yes, it would run all the rx handlers normally and then put the resulting skbs into a list instead of calling netif_receive_skb or napi_gro_frags. Whatever came of this? I realized I'm running Felix's patch since his mt76 driver needs it. Any chance it will go upstream? Thanks, Ben - Felix -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 0/3] mac80211: Trigger disconnect for STA during recovery
On 12/17/20 2:24 PM, Brian Norris wrote: On Tue, Dec 15, 2020 at 10:23:33AM -0800, Ben Greear wrote: On 12/15/20 9:21 AM, Youghandhar Chintala wrote: From: Rakesh Pillai Currently in case of target hardware restart ,we just reconfig and re-enable the security keys and enable the network queues to start data traffic back from where it was interrupted. Are there any known mac80211 radios/drivers that *can* support seamless restarts? If not, then just could always enable this feature in mac80211? I'm quite sure that iwlwifi intentionally supports a seamless restart. From my experience with dealing with user reports, I don't recall any issues where restart didn't function as expected, unless there was some deeper underlying failure (e.g., hardware/power failure; driver bugs / lockups). I don't have very good stats for ath10k/QCA6174, but it survives our testing OK and I again don't recall any user-reported complaints in this area. I'd say this is a weaker example though, as I don't have as clear of data. (By contrast, ath10k/WCN399x, which Rakesh, et al, are patching here, does not pass our tests at all, and clearly fails to recover from "seamless" restarts, as noted in patch 3.) I'd also note that we don't operate in AP mode -- only STA -- and IIRC Ben, you've complained about AP mode in the past. I complain about all sorts of things, but I'm usually running station mode :) Do you actually see iwlwifi stations stay associated through firmware crashes? Anyway, happy to hear some have seamless recovery, and in that case, I have no objections to the patch. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 0/3] mac80211: Trigger disconnect for STA during recovery
On 12/15/20 9:21 AM, Youghandhar Chintala wrote: From: Rakesh Pillai Currently in case of target hardware restart ,we just reconfig and re-enable the security keys and enable the network queues to start data traffic back from where it was interrupted. Are there any known mac80211 radios/drivers that *can* support seamless restarts? If not, then just could always enable this feature in mac80211? Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
BUG: Invalid wait context, related to serial console?
I saw this booting yesterday's 5.9.0-rc6+. I have one small patch applied, quite unlikely to cause any problems I think. Platform is apu2, fedora-27 linux, with serial console attached. [0.624831] = [0.624831] [ BUG: Invalid wait context ] [0.624831] 5.9.0-rc6+ #1 Not tainted [0.624831] - [0.624831] swapper/1/0 is trying to lock: [0.624831] 85523218 (_lock_key){}-{3:3}, at: serial8250_console_write+0xfd/0x520 [0.624831] other info that might help us debug this: [0.624831] context-{2:2} [0.624831] 2 locks held by swapper/1/0: [0.624831] #0: 83338a60 (console_lock){+.+.}-{0:0}, at: printk+0xad/0xde [0.624831] #1: 83338820 (console_owner){-...}-{0:0}, at: console_unlock+0x291/0x840 [0.624831] stack backtrace: [0.624831] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.9.0-rc6+ #1 [0.624831] Hardware name: PC Engines APU2/APU2, BIOS 4.0.7 02/28/2017 [0.624831] Call Trace: [0.624831] dump_stack+0xae/0xe8 [0.624831] __lock_acquire.cold+0x1b9/0x34d [0.624831] ? save_trace+0x345/0x440 [0.624831] ? lockdep_hardirqs_on_prepare+0x260/0x260 [0.624831] ? mark_lock+0x90/0xb00 [0.624831] lock_acquire+0x14f/0x5f0 [0.624831] ? serial8250_console_write+0xfd/0x520 [0.624831] ? lock_release+0x440/0x440 [0.624831] ? lockdep_hardirqs_on_prepare+0x260/0x260 [0.624831] ? __lock_acquire+0x85c/0x2f10 [0.624831] _raw_spin_lock_irqsave+0x43/0x60 [0.624831] ? serial8250_console_write+0xfd/0x520 [0.624831] serial8250_console_write+0xfd/0x520 [0.624831] ? serial8250_config_port+0x1630/0x1630 [0.624831] ? lock_downgrade+0x3a0/0x3a0 [0.624831] ? do_raw_spin_lock+0x114/0x1a0 [0.624831] ? rwlock_bug.part.0+0x50/0x50 [0.624831] console_unlock+0x582/0x840 [0.624831] ? devkmsg_open+0x170/0x170 [0.624831] ? do_raw_spin_unlock+0x8e/0xe0 [0.624831] ? printk+0xad/0xde [0.624831] vprintk_emit+0x165/0x3e0 [0.624831] printk+0xad/0xde [0.624831] ? kmsg_dump_rewind_nolock+0x54/0x54 [0.624831] ? lock_acquire+0x14f/0x5f0 [0.624831] ? do_raw_spin_unlock+0x8e/0xe0 [0.624831] ? do_raw_spin_unlock+0x7f/0xe0 [0.624831] common_interrupt+0x16c/0x180 [0.624831] asm_common_interrupt+0x1e/0x40 [0.624831] RIP: 0010:start_secondary+0x1b6/0x230 [0.624831] Code: 65 8b 3d 1d 97 f7 7e e8 48 f0 14 00 48 c7 c7 d0 d4 04 83 e8 4c 15 45 00 ff 15 76 d8 fa 01 e8 31 6f 2b 00 fb 66 0f 1f 44 00 00 <48> c7 c7 40 33 29 83 e8 2e 15 45 00 ff 15 c8 36 1f 02 0f ae f8 bf [0.624831] RSP: :8880cda3fed8 EFLAGS: 0202 [0.624831] RAX: 0001 RBX: 0001 RCX: dc00 [0.624831] RDX: RSI: RDI: 8109fc5f [0.624831] RBP: 111019b47fdb R08: 0001 R09: 0001 [0.624831] R10: ed101a494179 R11: 0001 R12: 000f3aac [0.624831] R13: R14: R15: [0.624831] ? start_secondary+0x1af/0x230 [0.624831] ? start_secondary+0x1af/0x230 [0.624831] ? set_cpu_sibling_map+0xc70/0xc70 [0.624831] secondary_startup_64+0xa4/0xb0 [2.531065] #2 [0.624831] __common_interrupt: 2.55 No irq handler for vector [2.542389] #3 [0.624831] __common_interrupt: 3.55 No irq handler for vector Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: Build failure for today's tree, Fedora-32
On 9/21/20 12:50 PM, Ben Greear wrote: Hello, I'm seeing this build failure, any idea what is the issue? I've cleared my ccache (ccache --clear) but that did not help. A pull from this morning builds on Fedora-29 with same .config file. I tried that same commit on my F32 system to no success, so it seems like it might be something with my F32 compiler/system. Looks like my git tree was corrupted. A fresh one is working better. Thanks, Ben
Build failure for today's tree, Fedora-32
5:5: error: conflicting types for ‘copy_siginfo_to_user’ 35 | int copy_siginfo_to_user(siginfo_t __user *to, const kernel_siginfo_t *from); | ^~~~ In file included from /home/greearb/git/linux-linus/arch/x86/include/uapi/asm/siginfo.h:13, from /home/greearb/git/linux-linus/include/uapi/linux/signal.h:6, from /home/greearb/git/linux-linus/include/linux/signal_types.h:10, from /home/greearb/git/linux-linus/include/linux/sched.h:30, from /home/greearb/git/linux-linus/include/linux/uaccess.h:6, from /home/greearb/git/linux-linus/arch/x86/include/asm/fpu/xstate.h:5, from /home/greearb/git/linux-linus/arch/x86/include/asm/pgtable.h:26, from /home/greearb/git/linux-linus/include/linux/pgtable.h:6, from /home/greearb/git/linux-linus/include/linux/kasan.h:14, from /home/greearb/git/linux-linus/include/linux/slab.h:136, from /home/greearb/git/linux-linus/include/linux/crypto.h:20, from /home/greearb/git/linux-linus/arch/x86/kernel/asm-offsets.c:9: /home/greearb/git/linux-linus/include/asm-generic/siginfo.h:35:12: note: previous declaration of ‘copy_siginfo_to_user’ was here 35 | extern int copy_siginfo_to_user(struct siginfo __user *to, const struct siginfo *from); |^~~~ make[2]: *** [/home/greearb/git/linux-linus/scripts/Makefile.build:117: arch/x86/kernel/asm-offsets.s] Error 1 make[1]: *** [/home/greearb/git/linux-linus/Makefile:1198: prepare0] Error 2 make: *** [/home/greearb/git/linux-linus/Makefile:185: __sub-make] Error 2 Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH v2 1/3] ath10k: Add history for tracking certain events
On 7/31/20 11:27 AM, Rakesh Pillai wrote: Add history for tracking the below events - register read - register write - IRQ trigger - NAPI poll - CE service - WMI cmd - WMI event - WMI tx completion This will help in debugging any crash or any improper behaviour. Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.1-01040-QCAHLSWMTPLZ-1 Signed-off-by: Rakesh Pillai --- drivers/net/wireless/ath/ath10k/ce.c | 1 + drivers/net/wireless/ath/ath10k/core.h| 74 + drivers/net/wireless/ath/ath10k/debug.c | 133 ++ drivers/net/wireless/ath/ath10k/debug.h | 74 + drivers/net/wireless/ath/ath10k/snoc.c| 15 +++- drivers/net/wireless/ath/ath10k/wmi-tlv.c | 1 + drivers/net/wireless/ath/ath10k/wmi.c | 10 +++ 7 files changed, 307 insertions(+), 1 deletion(-) +void ath10k_record_wmi_event(struct ath10k *ar, enum ath10k_wmi_type type, +u32 id, unsigned char *data) +{ + struct ath10k_wmi_event_entry *entry; + u32 idx; + + if (type == ATH10K_WMI_EVENT) { + if (!ar->wmi_event_history.record) + return; This check above is duplicated below, add it once at top of the method instead. + + spin_lock_bh(>wmi_event_history.hist_lock); + idx = ath10k_core_get_next_idx(>reg_access_history.index, + ar->wmi_event_history.max_entries); + spin_unlock_bh(>wmi_event_history.hist_lock); + entry = >wmi_event_history.record[idx]; + } else { + if (!ar->wmi_cmd_history.record) + return; + + spin_lock_bh(>wmi_cmd_history.hist_lock); + idx = ath10k_core_get_next_idx(>reg_access_history.index, + ar->wmi_cmd_history.max_entries); + spin_unlock_bh(>wmi_cmd_history.hist_lock); + entry = >wmi_cmd_history.record[idx]; + } + + entry->timestamp = ath10k_core_get_timestamp(); + entry->cpu_id = smp_processor_id(); + entry->type = type; + entry->id = id; + memcpy(>data, data + 4, ATH10K_WMI_DATA_LEN); +} +EXPORT_SYMBOL(ath10k_record_wmi_event); @@ -1660,6 +1668,11 @@ static int ath10k_snoc_probe(struct platform_device *pdev) ar->ce_priv = _snoc->ce; msa_size = drv_data->msa_size; + ath10k_core_reg_access_history_init(ar, ATH10K_REG_ACCESS_HISTORY_MAX); + ath10k_core_wmi_event_history_init(ar, ATH10K_WMI_EVENT_HISTORY_MAX); + ath10k_core_wmi_cmd_history_init(ar, ATH10K_WMI_CMD_HISTORY_MAX); + ath10k_core_ce_event_history_init(ar, ATH10K_CE_EVENT_HISTORY_MAX); Maybe only enable this once user turns it on? It sucks up a bit of memory? + ath10k_snoc_quirks_init(ar); ret = ath10k_snoc_resource_init(ar); diff --git a/drivers/net/wireless/ath/ath10k/wmi-tlv.c b/drivers/net/wireless/ath/ath10k/wmi-tlv.c index 932266d..9df5748 100644 --- a/drivers/net/wireless/ath/ath10k/wmi-tlv.c +++ b/drivers/net/wireless/ath/ath10k/wmi-tlv.c @@ -627,6 +627,7 @@ static void ath10k_wmi_tlv_op_rx(struct ath10k *ar, struct sk_buff *skb) if (skb_pull(skb, sizeof(struct wmi_cmd_hdr)) == NULL) goto out; + ath10k_record_wmi_event(ar, ATH10K_WMI_EVENT, id, skb->data); trace_ath10k_wmi_event(ar, id, skb->data, skb->len); consumed = ath10k_tm_event_wmi(ar, id, skb); diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c index a81a1ab..8ebd05c 100644 --- a/drivers/net/wireless/ath/ath10k/wmi.c +++ b/drivers/net/wireless/ath/ath10k/wmi.c @@ -1802,6 +1802,15 @@ struct sk_buff *ath10k_wmi_alloc_skb(struct ath10k *ar, u32 len) static void ath10k_wmi_htc_tx_complete(struct ath10k *ar, struct sk_buff *skb) { + struct wmi_cmd_hdr *cmd_hdr; + enum wmi_tlv_event_id id; + + cmd_hdr = (struct wmi_cmd_hdr *)skb->data; + id = MS(__le32_to_cpu(cmd_hdr->cmd_id), WMI_CMD_HDR_CMD_ID); + + ath10k_record_wmi_event(ar, ATH10K_WMI_TX_COMPL, id, + skb->data + sizeof(struct wmi_cmd_hdr)); + dev_kfree_skb(skb); } I think guard the above new code with if (unlikely(ar->ce_event_history.record)) { ... } All in all, I think I'd want to compile this out (while leaving other debug compiled in) since it seems this stuff would be rarely used and it adds method calls to hot paths. That is a decision for Kalle though, so see what he says... Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH] ath10k: Add history for tracking certain events
On 06/27/2020 10:12 PM, Rakesh Pillai wrote: -Original Message- From: Ben Greear Sent: Saturday, June 27, 2020 8:58 PM To: Rakesh Pillai ; ath...@lists.infradead.org Cc: linux-wirel...@vger.kernel.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] ath10k: Add history for tracking certain events On 06/26/2020 11:22 PM, Rakesh Pillai wrote: For debugging many issues, a history of the below mentioned events can help get an idea of what exactly was going on just before any issue occurred in the system. These event history will be collected only when the host driver is run in debug mode (i.e. with the config ATH10K_DEBUG enabled). This should be disabled by default unless user specifically pokes some debugfs value to turn it on so that it does not impact performance. Hi Ben, This history is enabled only if the user compiles the kernel with ATH10K_DEBUG. Making it runtime, adds a lot of "if" conditions for this history record. Do you suggest to add support to enable/disable it runtime even in ATH10K_DEBUG ? Yes, because you are adding lots of locks/unlocks. That is way more expensive than an if statement. You can add an 'unlikely' to the if check as well, so compiler will optimize for this feature not being enabled. Thanks, Ben Thanks, Ben Add history for tracking the below events - register read - register write - IRQ trigger - IRQ Enable - IRQ Disable - NAPI poll - CE service - WMI cmd - WMI event - WMI tx completion This will help in debugging any crash or any improper behaviour. -- Ben Greear Candela Technologies Inc http://www.candelatech.com -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH] ath10k: Add history for tracking certain events
On 06/26/2020 11:22 PM, Rakesh Pillai wrote: For debugging many issues, a history of the below mentioned events can help get an idea of what exactly was going on just before any issue occurred in the system. These event history will be collected only when the host driver is run in debug mode (i.e. with the config ATH10K_DEBUG enabled). This should be disabled by default unless user specifically pokes some debugfs value to turn it on so that it does not impact performance. Thanks, Ben Add history for tracking the below events - register read - register write - IRQ trigger - IRQ Enable - IRQ Disable - NAPI poll - CE service - WMI cmd - WMI event - WMI tx completion This will help in debugging any crash or any improper behaviour. -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 2/2] ath10k: Skip wait for delete response if firmware is down
On 06/26/2020 11:11 AM, Rakesh Pillai wrote: Currently the driver waits for response from the firmware for all the delete cmds, eg: vdev_delete, peer delete. If the firmware is down, these wait will always timeout and return an error. Also during subsytems recovery, any attempt to send a WMI cmd to the FW will return the -ESHUTDOWN status, which when returned to mac80211, can cause unnecessary warnings to be printed on to the console, as shown below [ 2559.529565] Call trace: [ 2559.532214] __sta_info_destroy_part2+0x160/0x168 [mac80211] [ 2559.538157] __sta_info_flush+0x124/0x180 [mac80211] [ 2559.543402] ieee80211_set_disassoc+0x130/0x2c0 [mac80211] [ 2559.549172] ieee80211_mgd_deauth+0x238/0x25c [mac80211] [ 2559.554764] ieee80211_deauth+0x24/0x30 [mac80211] [ 2559.559860] cfg80211_mlme_deauth+0x258/0x2b0 [cfg80211] [ 2559.565446] nl80211_deauthenticate+0xe4/0x110 [cfg80211] [ 2559.571064] genl_rcv_msg+0x3a0/0x440 [ 2559.574888] netlink_rcv_skb+0xb4/0x11c [ 2559.578877] genl_rcv+0x34/0x48 [ 2559.582162] netlink_unicast+0x14c/0x1e4 [ 2559.586235] netlink_sendmsg+0x2f0/0x360 [ 2559.590317] sock_sendmsg+0x44/0x5c [ 2559.593951] sys_sendmsg+0x1c8/0x290 [ 2559.598029] ___sys_sendmsg+0xa8/0xfc [ 2559.601840] __sys_sendmsg+0x8c/0xd0 [ 2559.605572] __arm64_compat_sys_sendmsg+0x2c/0x38 [ 2559.610468] el0_svc_common+0xa8/0x160 [ 2559.614372] el0_svc_compat_handler+0x2c/0x38 [ 2559.618905] el0_svc_compat+0x8/0x10 Skip the wait for delete response from the firmware if the firmware is down. Also return success to the mac80211 calls when the peer delete cmd fails with return status -ESHUTDOWN. Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.1-01040-QCAHLSWMTPLZ-1 Signed-off-by: Rakesh Pillai --- drivers/net/wireless/ath/ath10k/mac.c | 18 ++ 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c index dc7befc..7ac6549 100644 --- a/drivers/net/wireless/ath/ath10k/mac.c +++ b/drivers/net/wireless/ath/ath10k/mac.c @@ -701,7 +701,8 @@ static void ath10k_wait_for_peer_delete_done(struct ath10k *ar, u32 vdev_id, unsigned long time_left; int ret; - if (test_bit(WMI_SERVICE_SYNC_DELETE_CMDS, ar->wmi.svc_map)) { + if (test_bit(WMI_SERVICE_SYNC_DELETE_CMDS, ar->wmi.svc_map) && + test_bit(ATH10K_FLAG_CRASH_FLUSH, >dev_flags)) { Don't you mean !test_bit(ATH10K_FLAG_CRASH_FLUSH, >dev_flags)) ??? Or maybe I'm just mis-reading your patch? Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH v3 0/8] kernel: taint when the driver firmware crashes
On 05/28/2020 07:27 AM, Luis Chamberlain wrote: On Wed, May 27, 2020 at 02:36:42PM -0700, Jakub Kicinski wrote: On Wed, 27 May 2020 03:19:18 + Luis Chamberlain wrote: I read your patch, and granted, I will accept I was under the incorrect assumption that this can only be used by networking devices, however it the devlink approach achieves getting userspace the ability with iproute2 devlink util to query a device health, on to which we can peg firmware health. But *this* patch series is not about health status and letting users query it, its about a *critical* situation which has come up with firmware requiring me to reboot my system, and the lack of *any* infrastructure in the kernel today to inform userspace about it. So say we use netlink to report a critical health situation, how are we informing userspace with your patch series about requring a reboot? One of main features of netlink is pub/sub model of notifications. Whatever you imagine listening to your uevent can listen to devlink-health notifications via devlink. In fact I've shown this off in the RFC patches I sent to you, see the devlink mon health command being used. Yes but I looked at iputils2 devlink and seems I made an incorrect assumption this can only be used for a network device rather than a struct device. I'll take a second look. Hello Jakub, I'm thinking about something similar to what Luis is proposing, but in my case I'd like to report just when the driver knows the hardware is gone and cannot be recovered, like when this is reported: [ 2548.851832] WARNING: CPU: 3 PID: 98 at backports-4.19.98-1/net/mac80211/util.c:2040 ieee80211_reconfig+0x98/0xb64 [mac80211] [ 2548.856020] Hardware became unavailable during restart. I'd like to be able to tie this into a watch-dog program to allow automatic reboot of the system soon after this event is seen, for instance. Could you post your devlink RFC patches somewhere public? Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [RFC 1/2] devlink: add simple fw crash helpers
On 05/25/2020 02:07 AM, Andy Shevchenko wrote: On Fri, May 22, 2020 at 04:23:55PM -0700, Steve deRosier wrote: On Fri, May 22, 2020 at 2:51 PM Luis Chamberlain wrote: I had to go RTFM re: kernel taints because it has been a very long time since I looked at them. It had always seemed to me that most were caused by "kernel-unfriendly" user actions. The most famous of course is loading proprietary modules, out-of-tree modules, forced module loads, etc... Honestly, I had forgotten the large variety of uses of the taint flags. For anyone who hasn't looked at taints recently, I recommend: https://www.kernel.org/doc/html/latest/admin-guide/tainted-kernels.html In light of this I don't object to setting a taint on this anymore. I'm a little uneasy, but I've softened on it now, and now I feel it depends on implementation. Specifically, I don't think we should set a taint flag when a driver easily handles a routine firmware crash and is confident that things have come up just fine again. In other words, triggering the taint in every driver module where it spits out a log comment that it had a firmware crash and had to recover seems too much. Sure, firmware shouldn't crash, sure it should be open source so we can fix it, whatever... While it may sound idealistic the firmware for the end-user, and even for mere kernel developer like me, is a complete blackbox which has more access than root user in the kernel. We have tons of firmwares and each of them potentially dangerous beast. As a user I really care about my data and privacy (hacker can oops a firmware in order to set a specific vector attack). So, tainting kernel is _a least_ we can do there, the strict rules would be to reboot immediately. those sort of wishful comments simply ignore reality and our ability to affect effective change. We can encourage users not to buy cheap crap for the starter. There is no stable wifi firmware for any price. There is also no obvious feedback from even name-brand NICs like ath10k or AX200 when you report a crash. That said, at least in my experience with ath10k-ct, the OS normally recovers fine from firmware crashes. ath10k already reports full crash reports on udev, so easy for user-space to notice and report bug reports upstream if it cares to. Probably other NICs do the same, and if not, they certainly could. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()
On 05/18/2020 10:09 AM, Luis Chamberlain wrote: On Mon, May 18, 2020 at 09:58:53AM -0700, Ben Greear wrote: On 05/18/2020 09:51 AM, Luis Chamberlain wrote: On Sat, May 16, 2020 at 03:24:01PM +0200, Johannes Berg wrote: On Fri, 2020-05-15 at 21:28 +, Luis Chamberlain wrote:> module_firmware_crashed You didn't CC me or the wireless list on the rest of the patches, so I'm replying to a random one, but ... What is the point here? This should in no way affect the integrity of the system/kernel, for most devices anyway. Keyword you used here is "most device". And in the worst case, *who* knows what other odd things may happen afterwards. So what if ath10k's firmware crashes? If there's a driver bug it will not handle it right (and probably crash, WARN_ON, or something else), but if the driver is working right then that will not affect the kernel at all. Sometimes the device can go into a state which requires driver removal and addition to get things back up. It would be lovely to be able to detect this case in the driver/system somehow! I haven't seen any such cases recently, I assure you that I have run into it. Once it does again I'll report the crash, but the problem with some of this is that unless you scrape the log you won't know. Eventually, a uevent would indeed tell inform me. but in case there is some common case you see, maybe we can think of a way to detect it? ath10k is just one case, this patch series addresses a simple way to annotate this tree-wide. So maybe I can understand that maybe you want an easy way to discover - per device - that the firmware crashed, but that still doesn't warrant a complete kernel taint. That is one reason, another is that a taint helps support cases *fast* easily detect if the issue was a firmware crash, instead of scraping logs for driver specific ways to say the firmware has crashed. You can listen for udev events (I think that is the right term), and find crashes that way. You get the actual crash info as well. My follow up to this was to add uevent to add_taint() as well, this way these could generically be processed by userspace. I'm not opposed to the taint, though I have not thought much on it. But, if you can already get the crash info from uevent, and it automatically comes without polling or scraping logs, then what benefit beyond that does the taint give you? Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()
On 05/18/2020 09:51 AM, Luis Chamberlain wrote: On Sat, May 16, 2020 at 03:24:01PM +0200, Johannes Berg wrote: On Fri, 2020-05-15 at 21:28 +, Luis Chamberlain wrote:> module_firmware_crashed You didn't CC me or the wireless list on the rest of the patches, so I'm replying to a random one, but ... What is the point here? This should in no way affect the integrity of the system/kernel, for most devices anyway. Keyword you used here is "most device". And in the worst case, *who* knows what other odd things may happen afterwards. So what if ath10k's firmware crashes? If there's a driver bug it will not handle it right (and probably crash, WARN_ON, or something else), but if the driver is working right then that will not affect the kernel at all. Sometimes the device can go into a state which requires driver removal and addition to get things back up. It would be lovely to be able to detect this case in the driver/system somehow! I haven't seen any such cases recently, but in case there is some common case you see, maybe we can think of a way to detect it? So maybe I can understand that maybe you want an easy way to discover - per device - that the firmware crashed, but that still doesn't warrant a complete kernel taint. That is one reason, another is that a taint helps support cases *fast* easily detect if the issue was a firmware crash, instead of scraping logs for driver specific ways to say the firmware has crashed. You can listen for udev events (I think that is the right term), and find crashes that way. You get the actual crash info as well. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH] ath10k: increase rx buffer size to 2048
On 04/28/2020 05:01 AM, Kalle Valo wrote: Sven Eckelmann writes: On Wednesday, 1 April 2020 09:00:49 CEST Sven Eckelmann wrote: On Wednesday, 5 February 2020 20:10:43 CEST Linus Lüssing wrote: From: Linus Lüssing Before, only frames with a maximum size of 1528 bytes could be transmitted between two 802.11s nodes. For batman-adv for instance, which adds its own header to each frame, we typically need an MTU of at least 1532 bytes to be able to transmit without fragmentation. This patch now increases the maxmimum frame size from 1528 to 1656 bytes. [...] @Kalle, I saw that this patch was marked as deferred [1] but I couldn't find any mail why it was done so. It seems like this currently creates real world problems - so would be nice if you could explain shortly what is currently blocking its acceptance. Ping? Sorry for the delay, my plan was to first write some documentation about different hardware families but haven't managed to do that yet. My problem with this patch is that I don't know what hardware and firmware versions were tested, so it needs analysis before I feel safe to apply it. The ath10k hardware families are very different that even if a patch works perfectly on one ath10k hardware it could still break badly on another one. What makes me faster to apply ath10k patches is to have comprehensive analysis in the commit log. This shows me the patch author has considered about all hardware families, not just the one he is testing on, and that I don't need to do the analysis myself. It has been in ath10k-ct for a while, and that has some fairly wide coverage in OpenWrt, so likely if there were problems we would have seen it already. I did not make any specific changes to firmware to support this, so upstream firmware should behave similarly. Seems like upstream ath10k could really benefit from having some test beds so you can actually test code on different chips and have confidence in your changes! Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: WARNING at net/mac80211/sta_info.c:1057 (__sta_info_destroy_part2())
On 09/11/2019 06:21 AM, Linus Torvalds wrote: On Wed, Sep 11, 2019 at 2:03 PM Ben Greear wrote: Out of curiosity, I'm interested to know what ath10k NIC chipset this is from. It's a Dell XPS 13 9380, with 02:00.0 Network controller: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter (rev 32) Subsystem: Bigfoot Networks, Inc. Killer 1435 Wireless-AC (numeric PCI ID 168c:003e, subsystem 1a56:143a). The ath10k driver says qca6174 hw3.2 target 0x0503 chip_id 0x00340aff sub 1a56:143a firmware ver WLAN.RM.4.4.1-00140-QCARMSWPZ-1 api 6 features wowlan,ignore-otp,mfp crc32 29eb8ca1 board_file api 2 bmi_id N/A crc32 4ed3569e if that tells you anything more. That means it is something I have never used nor have firmware for, but the WMI logic should be similar to what I described and have experienced with other chips. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: WARNING at net/mac80211/sta_info.c:1057 (__sta_info_destroy_part2())
On 09/11/2019 05:04 AM, Johannes Berg wrote: On Wed, 2019-09-11 at 12:58 +0100, Linus Torvalds wrote: And I didn't think about it or double-check, because the errors that then followed later _looked_ like that TX power failing that I thought hadn't happened. Yeah, it could be something already got stuck there, hard to say. Since we see that something actually did an rfkill operation. Did you push a button there? No, I tried to turn off and turn on Wifi manually (no button, just the settings panel). That does usually also cause rfkill, so that explains how we got down this particular code path. I didn't notice the WARN_ON(), I just noticed that there was no networking, and "turn it off and on again" is obviously the first thing to try ;) :-) Sep 11 10:27:13 xps13 kernel: WARNING: CPU: 4 PID: 1246 at net/mac80211/sta_info.c:1057 __sta_info_destroy_part2+0x147/0x150 [mac80211] but if you want full logs I can send them in private to you. No, it's fine, though maybe Kalle does - he was stepping out for a while but said he'd look later. This is the interesting time - 10:27:13 we get one of the first failures. Really the first one was this: Sep 11 10:27:07 xps13 kernel: ath10k_pci :02:00.0: wmi command 16387 timeout, restarting hardware I do suspect it's atheros and suspend/resume or something. The wireless clearly worked for a while after the resume, but then at some point it stopped. I'm not really sure it's related to suspend/resume at all, the firmware seems to just have gotten stuck, and the device and firmware most likely got reset over the suspend/resume anyway. The only explanation I therefore have is that something is just taking *forever* in that code path, hence my question about timing information on the logs. Yeah, maybe it would time out everything eventually. But not for a long time. It hadn't cleared up by Sep 11 10:36:21 xps13 gnome-session-f[6837]: gnome-session-failed: Fatal IO error 0 (Success) on X server :0. Ok, that's way longer than I would have guessed even! That's over 9 minutes, that'd be close to 200 commands having to be issued and timing out ... I don't know. What I wrote before is basically all I can say, I think the driver gets stuck somewhere waiting for the device "forever", and the stack just doesn't get to release the lock, causing all the follow- up problems. It looks to me like the ath10k firmware is not responding to commands and/or is out of its WMI tx credits. The code often takes a lock and then blocks for up to 3 or so seconds waiting for a response from the firmware, and the mac80211 calling code is often already holding rtnl. Pretty much every mac80211 call will cause a WMI message and thus potentially hit this timeout. This can easily cause rtnl to be held for 3 seconds, but after that, I believe upstream ath10k will now time out and kill the firmware and restart. (I run a significantly modified ath10k driver, and that is how mine works, at least.) In this case, it looks like restarting the firmware/NIC failed, and I guess that must get it in a state where it is still blocking and trying to talk to the firmware? Or maybe deadlock down inside ath10k driver. For what it's worth, we see that WARN_ON often when ath10k firmware crashes, but it seems to not be a big deal and the system normally recovers fine. Out of curiosity, I'm interested to know what ath10k NIC chipset this is from. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH] x86/speculation: Add document to describe Spectre and its mitigations
On 1/7/19 9:57 AM, Tim Chen wrote: On 12/31/18 8:22 AM, Ben Greear wrote: On 12/21/2018 05:17 PM, Tim Chen wrote: If you don't worry about security and performance is paramount, then boot with "nospectre_v2". That's explained in the document. There seem to be lots of different variants of this type of problem. It was not clear to me that just doing nospectre_v2 would be sufficient to get back full performance. The performance penalty comes from retpoline penalizing indirect branch predictions in kernel. With nospectre_v2, retpoline is disabled so you should get all the performance back from spectre mitigation. This does not disable kernel page table isolation for meltdown mitigation, which also needs to be turned off if you want to get the full performance back. That's somewhat beyond the scope of this doc on Spectre. The two bug families (spectre and meltdown) are conflated in my mind, at least. For those of us who do not really understand this stuff in detail, it would be good to at least mention some notes about Meltdown I think. And anyway, I would like to compile the kernel to not need that command-line option, so I am still interesting in what compile options need to be set to what values... If you just want to disable spectre mitigation, setting CONFIG_RETPOLINE=n should do the trick. If you also want to disable meltdown mitigation, set CONFIG_PAGE_TABLE_ISOLATION=n. Ok, are there any other CONFIG options that relate to fixing security bugs that have noticeable performance impacts or are these two the complete list? Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH] x86/speculation: Add document to describe Spectre and its mitigations
On 12/21/2018 05:17 PM, Tim Chen wrote: On 12/21/18 1:59 PM, Ben Greear wrote: On 12/21/18 9:44 AM, Tim Chen wrote: Thomas, Andi and I have made an update to our draft of the Spectre admin guide. We may be out on Christmas vacation for a while. But we want to send it out for everyone to take a look. Can you add a section on how to compile out all mitigations that have anything beyond negligible performance impact for those running systems where performance is more important than security? If you don't worry about security and performance is paramount, then boot with "nospectre_v2". That's explained in the document. There seem to be lots of different variants of this type of problem. It was not clear to me that just doing nospectre_v2 would be sufficient to get back full performance. And anyway, I would like to compile the kernel to not need that command-line option, so I am still interesting in what compile options need to be set to what values... Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH] x86/speculation: Add document to describe Spectre and its mitigations
On 12/21/18 9:44 AM, Tim Chen wrote: Thomas, Andi and I have made an update to our draft of the Spectre admin guide. We may be out on Christmas vacation for a while. But we want to send it out for everyone to take a look. Can you add a section on how to compile out all mitigations that have anything beyond negligible performance impact for those running systems where performance is more important than security? Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: Oops on the system startup in the function part_in_flight()
Hello, I am trying to bisect a pktgen related performance regression that appears to be between the 4.13 and 4.14 kernels. But, my system keeps crashing due to part_in_flight oops so bisecting is not going well. It looks like this oops was fixed, but the link to the patch in the email is no longer valid. Can someone let me know what patch fixes this crash so I can apply it while bisecting? https://www.spinics.net/lists/linux-block/msg17809.html Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: Oops on the system startup in the function part_in_flight()
Hello, I am trying to bisect a pktgen related performance regression that appears to be between the 4.13 and 4.14 kernels. But, my system keeps crashing due to part_in_flight oops so bisecting is not going well. It looks like this oops was fixed, but the link to the patch in the email is no longer valid. Can someone let me know what patch fixes this crash so I can apply it while bisecting? https://www.spinics.net/lists/linux-block/msg17809.html Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH] net: dev_forward_skb(): Scrub packet's per-netns info only when crossing netns
On 03/20/2018 09:44 AM, Liran Alon wrote: On 20/03/18 18:24, ebied...@xmission.com wrote: I don't believe the current behavior is a bug. I looked through the history. Basically skb_scrub_packet started out as the scrubbing needed for crossing network namespaces. Then tunnels which needed 90% of the functionality started calling it, with the xnet flag added. Because the tunnels needed to preserve their historic behavior. Then dev_forward_skb started calling skb_scrub_packet. A veth pair is supposed to give the same behavior as a cross-over cable plugged into two local nics. A cross over cable won't preserve things like the skb mark. So I don't see why anyone would expect a veth pair to preserve the mark. I disagree with this argument. I think that a skb crossing netns is what simulates a real packet crossing physical computers. Following your argument, why would skb->mark should be preserved when crossing netdevs on same netns via routing? But this does today preserve skb->mark. Therefore, I do think that skb->mark should conceptually only be scrubbed when crossing netns. Regardless of the netdev used to cross it. It should be scrubbed in VETH as well. That is one way to make virtual routers. Possibly the newer VRF features will give another better way to do it, but you should not break things that used to work. Now, if you want to add a new feature that allows one to configure the kernel (or VETH) for a new behavior, then that might be something to consider. Right now I don't see the point of handling packets that don't cross network namespace boundaries specially, other than to preserve backwards compatibility. Well, backwards compat is a big deal all by itself! Thanks, Ben Eric -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: [PATCH] net: dev_forward_skb(): Scrub packet's per-netns info only when crossing netns
On 03/20/2018 09:44 AM, Liran Alon wrote: On 20/03/18 18:24, ebied...@xmission.com wrote: I don't believe the current behavior is a bug. I looked through the history. Basically skb_scrub_packet started out as the scrubbing needed for crossing network namespaces. Then tunnels which needed 90% of the functionality started calling it, with the xnet flag added. Because the tunnels needed to preserve their historic behavior. Then dev_forward_skb started calling skb_scrub_packet. A veth pair is supposed to give the same behavior as a cross-over cable plugged into two local nics. A cross over cable won't preserve things like the skb mark. So I don't see why anyone would expect a veth pair to preserve the mark. I disagree with this argument. I think that a skb crossing netns is what simulates a real packet crossing physical computers. Following your argument, why would skb->mark should be preserved when crossing netdevs on same netns via routing? But this does today preserve skb->mark. Therefore, I do think that skb->mark should conceptually only be scrubbed when crossing netns. Regardless of the netdev used to cross it. It should be scrubbed in VETH as well. That is one way to make virtual routers. Possibly the newer VRF features will give another better way to do it, but you should not break things that used to work. Now, if you want to add a new feature that allows one to configure the kernel (or VETH) for a new behavior, then that might be something to consider. Right now I don't see the point of handling packets that don't cross network namespace boundaries specially, other than to preserve backwards compatibility. Well, backwards compat is a big deal all by itself! Thanks, Ben Eric -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH net] virtio-net: disable NAPI only when enabled during XDP set
On 02/28/2018 09:22 AM, David Miller wrote: From: Jason Wang <jasow...@redhat.com> Date: Wed, 28 Feb 2018 18:20:04 +0800 We try to disable NAPI to prevent a single XDP TX queue being used by multiple cpus. But we don't check if device is up (NAPI is enabled), this could result stall because of infinite wait in napi_disable(). Fixing this by checking device state through netif_running() before. Fixes: 4941d472bf95b ("virtio-net: do not reset during XDP set") Signed-off-by: Jason Wang <jasow...@redhat.com> Yes, mis-paired NAPI enable/disable are really a pain. Probably, we can do something in the interfaces or mechanisms to make this less error prone and less fragile. Anyways, applied and queued up for -stable, thanks! I just hit a similar bug in ath10k. It seems like napi has plenty of free bit flags so it could keep track of 'is-enabled' state and allow someone to call napi_disable multiple times w/out deadlocking. Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: [PATCH net] virtio-net: disable NAPI only when enabled during XDP set
On 02/28/2018 09:22 AM, David Miller wrote: From: Jason Wang Date: Wed, 28 Feb 2018 18:20:04 +0800 We try to disable NAPI to prevent a single XDP TX queue being used by multiple cpus. But we don't check if device is up (NAPI is enabled), this could result stall because of infinite wait in napi_disable(). Fixing this by checking device state through netif_running() before. Fixes: 4941d472bf95b ("virtio-net: do not reset during XDP set") Signed-off-by: Jason Wang Yes, mis-paired NAPI enable/disable are really a pain. Probably, we can do something in the interfaces or mechanisms to make this less error prone and less fragile. Anyways, applied and queued up for -stable, thanks! I just hit a similar bug in ath10k. It seems like napi has plenty of free bit flags so it could keep track of 'is-enabled' state and allow someone to call napi_disable multiple times w/out deadlocking. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 1/3] ath10k: remove ath10k_vif_to_arvif()
On 02/09/2017 11:03 PM, Valo, Kalle wrote: Ben Greear <gree...@candelatech.com> writes: On 02/07/2017 01:14 AM, Valo, Kalle wrote: Adrian Chadd <adr...@freebsd.org> writes: Removing this method makes the diff to FreeBSD larger, as "vif" in FreeBSD is a different pointer. (Yes, I have ath10k on freebsd working and I'd like to find a way to reduce the diff moving forward.) I don't like this "(void *) vif->drv_priv" style that much either but apparently it's commonly used in Linux wireless code and already parts of ath10k. So this patch just unifies the coding style. Surely the code compiles to the same thing, so why add a patch that makes it more difficult for Adrian and makes the code no easier to read for the rest of us? Because that's the coding style used already in Linux. It's great to see that parts of ath10k can be used also in other systems but in principle I'm not very fond of the idea starting to reject valid upstream patches because of driver forks. There are lots of people trying to maintain out-of-tree or backported patches to ath10k, and every time there is a meaningless style change, that just makes us waste more time on useless work instead of having time to work on more important matters. Thanks, Ben I think backports project is doing it right, it's not limiting upstream development in any way and handles all the API changes internally. Maybe FreeBSD could do something similar? -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 1/3] ath10k: remove ath10k_vif_to_arvif()
On 02/09/2017 11:03 PM, Valo, Kalle wrote: Ben Greear writes: On 02/07/2017 01:14 AM, Valo, Kalle wrote: Adrian Chadd writes: Removing this method makes the diff to FreeBSD larger, as "vif" in FreeBSD is a different pointer. (Yes, I have ath10k on freebsd working and I'd like to find a way to reduce the diff moving forward.) I don't like this "(void *) vif->drv_priv" style that much either but apparently it's commonly used in Linux wireless code and already parts of ath10k. So this patch just unifies the coding style. Surely the code compiles to the same thing, so why add a patch that makes it more difficult for Adrian and makes the code no easier to read for the rest of us? Because that's the coding style used already in Linux. It's great to see that parts of ath10k can be used also in other systems but in principle I'm not very fond of the idea starting to reject valid upstream patches because of driver forks. There are lots of people trying to maintain out-of-tree or backported patches to ath10k, and every time there is a meaningless style change, that just makes us waste more time on useless work instead of having time to work on more important matters. Thanks, Ben I think backports project is doing it right, it's not limiting upstream development in any way and handles all the API changes internally. Maybe FreeBSD could do something similar? -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 1/3] ath10k: remove ath10k_vif_to_arvif()
On 02/07/2017 01:14 AM, Valo, Kalle wrote: Adrian Chadd <adr...@freebsd.org> writes: Removing this method makes the diff to FreeBSD larger, as "vif" in FreeBSD is a different pointer. (Yes, I have ath10k on freebsd working and I'd like to find a way to reduce the diff moving forward.) I don't like this "(void *) vif->drv_priv" style that much either but apparently it's commonly used in Linux wireless code and already parts of ath10k. So this patch just unifies the coding style. Surely the code compiles to the same thing, so why add a patch that makes it more difficult for Adrian and makes the code no easier to read for the rest of us? Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 1/3] ath10k: remove ath10k_vif_to_arvif()
On 02/07/2017 01:14 AM, Valo, Kalle wrote: Adrian Chadd writes: Removing this method makes the diff to FreeBSD larger, as "vif" in FreeBSD is a different pointer. (Yes, I have ath10k on freebsd working and I'd like to find a way to reduce the diff moving forward.) I don't like this "(void *) vif->drv_priv" style that much either but apparently it's commonly used in Linux wireless code and already parts of ath10k. So this patch just unifies the coding style. Surely the code compiles to the same thing, so why add a patch that makes it more difficult for Adrian and makes the code no easier to read for the rest of us? Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH] crypto: ccm - avoid scatterlist for MAC encryption
On 10/19/2016 08:08 AM, Ard Biesheuvel wrote: On 19 October 2016 at 08:43, Johannes Berg <johan...@sipsolutions.net> wrote: On Wed, 2016-10-19 at 11:31 +0800, Herbert Xu wrote: We could probably make mac80211 do that too, but can we guarantee in- order processing? Anyway, it's pretty low priority, maybe never happening, since hardly anyone really uses "software" crypto, the wifi devices mostly have it built in anyway. Indeed. The code is now correct in terms of API requirements, so let's just wait for someone to complain about any performance regressions. Do you actually expect performance regressions? I'll be complaining if so, but will test first :) Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: [PATCH] crypto: ccm - avoid scatterlist for MAC encryption
On 10/19/2016 08:08 AM, Ard Biesheuvel wrote: On 19 October 2016 at 08:43, Johannes Berg wrote: On Wed, 2016-10-19 at 11:31 +0800, Herbert Xu wrote: We could probably make mac80211 do that too, but can we guarantee in- order processing? Anyway, it's pretty low priority, maybe never happening, since hardly anyone really uses "software" crypto, the wifi devices mostly have it built in anyway. Indeed. The code is now correct in terms of API requirements, so let's just wait for someone to complain about any performance regressions. Do you actually expect performance regressions? I'll be complaining if so, but will test first :) Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: Question on kzalloc and GFP_DMA32
On 09/28/2016 02:11 PM, David Rientjes wrote: On Tue, 27 Sep 2016, Ben Greear wrote: I have been running this patch for a while: ath10k: Use GPF_DMA32 for firmware swap memory. This fixes OS crash when using QCA 9984 NIC on x86-64 system without vt-d enabled. Also tested on ea8500 with 9980, and x86-64 with 9980 and 9880. All tests were with CT firmware. Signed-off-by: Ben Greear <gree...@candelatech.com> drivers/net/wireless/ath/ath10k/wmi.c index e20aa39..727b3aa 100644 @@ -4491,7 +4491,7 @@ static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32 req_id, if (!pool_size) return -EINVAL; -vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN); +vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN | GFP_DMA32); if (!vaddr) num_units /= 2; } It sometimes works, on some systems, but then it also often fails with the splat below, especially with several NICs in the system. I suppose it's failing sometimes because the BUG() will trigger when trying to allocate new slab or CONFIG_ZONE_DMA32 isn't configured. That shouldn't panic the kernel anymore since commit 72baeef0c271 ("slab: do not panic on invalid gfp_mask") in 4.8, but you shouldn't be doing kzalloc(..., ... | GFP_DMA32) anyway. CONFIG_ZONE_DMA32 is enabled in my .config. pool_size is relatively large (maybe 256k or so). If it's 256k, why allocate through the slab allocator? Why not alloc_pages(GFP_KERNEL | GFP_DMA32 | __GFP_ZERO | __GFP_NOWARN, get_order(pool_size))? I really don't understand the (subtle?) difference between alloc_pages and kzalloc, but I will give your suggestion a try and see if it works. If you have time, maybe you could take a look at drivers/net/wireless/ath/ath10k/wmi.c in the ath10k_wmi_alloc_chunk method to see if you notice any problems with using alloc_pages there? Thanks for the suggestion. -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: Question on kzalloc and GFP_DMA32
On 09/28/2016 02:11 PM, David Rientjes wrote: On Tue, 27 Sep 2016, Ben Greear wrote: I have been running this patch for a while: ath10k: Use GPF_DMA32 for firmware swap memory. This fixes OS crash when using QCA 9984 NIC on x86-64 system without vt-d enabled. Also tested on ea8500 with 9980, and x86-64 with 9980 and 9880. All tests were with CT firmware. Signed-off-by: Ben Greear drivers/net/wireless/ath/ath10k/wmi.c index e20aa39..727b3aa 100644 @@ -4491,7 +4491,7 @@ static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32 req_id, if (!pool_size) return -EINVAL; -vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN); +vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN | GFP_DMA32); if (!vaddr) num_units /= 2; } It sometimes works, on some systems, but then it also often fails with the splat below, especially with several NICs in the system. I suppose it's failing sometimes because the BUG() will trigger when trying to allocate new slab or CONFIG_ZONE_DMA32 isn't configured. That shouldn't panic the kernel anymore since commit 72baeef0c271 ("slab: do not panic on invalid gfp_mask") in 4.8, but you shouldn't be doing kzalloc(..., ... | GFP_DMA32) anyway. CONFIG_ZONE_DMA32 is enabled in my .config. pool_size is relatively large (maybe 256k or so). If it's 256k, why allocate through the slab allocator? Why not alloc_pages(GFP_KERNEL | GFP_DMA32 | __GFP_ZERO | __GFP_NOWARN, get_order(pool_size))? I really don't understand the (subtle?) difference between alloc_pages and kzalloc, but I will give your suggestion a try and see if it works. If you have time, maybe you could take a look at drivers/net/wireless/ath/ath10k/wmi.c in the ath10k_wmi_alloc_chunk method to see if you notice any problems with using alloc_pages there? Thanks for the suggestion. -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Question on kzalloc and GFP_DMA32
I have been running this patch for a while: ath10k: Use GPF_DMA32 for firmware swap memory. This fixes OS crash when using QCA 9984 NIC on x86-64 system without vt-d enabled. Also tested on ea8500 with 9980, and x86-64 with 9980 and 9880. All tests were with CT firmware. Signed-off-by: Ben Greear <gree...@candelatech.com> drivers/net/wireless/ath/ath10k/wmi.c index e20aa39..727b3aa 100644 @@ -4491,7 +4491,7 @@ static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32 req_id, if (!pool_size) return -EINVAL; -vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN); +vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN | GFP_DMA32); if (!vaddr) num_units /= 2; } It sometimes works, on some systems, but then it also often fails with the splat below, especially with several NICs in the system. pool_size is relatively large (maybe 256k or so). Any idea for a more proper way to do this? gfp: 4 [ cut here ] kernel BUG at /home/greearb/git/linux-4.7.dev.y/mm/slub.c:1508! invalid opcode: [#1] PREEMPT SMP Modules linked in: coretemp hwmon ath9k intel_rapl ath10k_pci x86_pkg_temp_thermal ath9k_common ath10k_core intel_powerclamp ath9k_hw ath kvm iTCO_wdt mac80211 iTCO_vendor_support irqbypass snd_hda_codec_hdmi 6 CPU: 2 PID: 268 Comm: kworker/u8:5 Not tainted 4.7.2+ #16 Hardware name: To be filled by O.E.M. To be filled by O.E.M./ChiefRiver, BIOS 4.6.5 06/07/2013 Workqueue: ath10k_aux_wq ath10k_wmi_event_service_ready_work [ath10k_core] task: 880036433a00 ti: 88003644 task.ti: 88003644 RIP: 0010:[] [] new_slab+0x39a/0x410 RSP: 0018:880036443b58 EFLAGS: 00010092 RAX: 0006 RBX: 024082c4 RCX: RDX: 0006 RSI: 88021e30dd08 RDI: 88021e30dd08 RBP: 880036443b90 R08: R09: R10: R11: 0372 R12: 88021dc01200 R13: 88021dc00cc0 R14: 88021dc01200 R15: 0001 FS: () GS:88021e30() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7f3e65c1c730 CR3: 01e06000 CR4: 001406e0 Stack: 8127a4fc 0a01ff10 024082c4 88021dc01200 88021dc00cc0 88021dc01200 0001 880036443c58 81247ac6 88021e31b360 880036433a00 880036433a00 Call Trace: [] ? __d_lookup+0x9c/0x160 [] ___slab_alloc+0x396/0x4a0 [] ? ath10k_wmi_event_service_ready_work+0x5ad/0x800 [ath10k_core] [] ? alloc_kmem_pages+0x9/0x10 [] ? kmalloc_order+0x13/0x40 [] ? ath10k_wmi_event_service_ready_work+0x5ad/0x800 [ath10k_core] [] __slab_alloc.isra.72+0x26/0x40 [] __kmalloc+0x147/0x1b0 [] ath10k_wmi_event_service_ready_work+0x5ad/0x800 [ath10k_core] [] ? dequeue_entity+0x261/0xac0 [] process_one_work+0x148/0x420 [] worker_thread+0x49/0x480 [] ? rescuer_thread+0x330/0x330 [] kthread+0xc4/0xe0 [] ret_from_fork+0x1f/0x40 [] ? kthread_create_on_node+0x170/0x170 Code: e9 65 fd ff ff 49 8b 57 20 48 8d 42 ff 83 e2 01 49 0f 44 c7 f0 80 08 40 e9 6f fd ff ff 89 c6 48 c7 c7 01 36 c7 81 e8 e8 40 fa ff <0f> 0b ba 00 10 00 00 be 5a 00 00 00 48 89 c7 48 d3 e2 e8 bf 18 RIP [] new_slab+0x39a/0x410 RSP ---[ end trace ea3b0043b2911d93 ]--- static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node) { if (unlikely(flags & GFP_SLAB_BUG_MASK)) { pr_emerg("gfp: %u\n", flags & GFP_SLAB_BUG_MASK); BUG(); } return allocate_slab(s, flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node); } Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Question on kzalloc and GFP_DMA32
I have been running this patch for a while: ath10k: Use GPF_DMA32 for firmware swap memory. This fixes OS crash when using QCA 9984 NIC on x86-64 system without vt-d enabled. Also tested on ea8500 with 9980, and x86-64 with 9980 and 9880. All tests were with CT firmware. Signed-off-by: Ben Greear drivers/net/wireless/ath/ath10k/wmi.c index e20aa39..727b3aa 100644 @@ -4491,7 +4491,7 @@ static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32 req_id, if (!pool_size) return -EINVAL; -vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN); +vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN | GFP_DMA32); if (!vaddr) num_units /= 2; } It sometimes works, on some systems, but then it also often fails with the splat below, especially with several NICs in the system. pool_size is relatively large (maybe 256k or so). Any idea for a more proper way to do this? gfp: 4 [ cut here ] kernel BUG at /home/greearb/git/linux-4.7.dev.y/mm/slub.c:1508! invalid opcode: [#1] PREEMPT SMP Modules linked in: coretemp hwmon ath9k intel_rapl ath10k_pci x86_pkg_temp_thermal ath9k_common ath10k_core intel_powerclamp ath9k_hw ath kvm iTCO_wdt mac80211 iTCO_vendor_support irqbypass snd_hda_codec_hdmi 6 CPU: 2 PID: 268 Comm: kworker/u8:5 Not tainted 4.7.2+ #16 Hardware name: To be filled by O.E.M. To be filled by O.E.M./ChiefRiver, BIOS 4.6.5 06/07/2013 Workqueue: ath10k_aux_wq ath10k_wmi_event_service_ready_work [ath10k_core] task: 880036433a00 ti: 88003644 task.ti: 88003644 RIP: 0010:[] [] new_slab+0x39a/0x410 RSP: 0018:880036443b58 EFLAGS: 00010092 RAX: 0006 RBX: 024082c4 RCX: RDX: 0006 RSI: 88021e30dd08 RDI: 88021e30dd08 RBP: 880036443b90 R08: R09: R10: R11: 0372 R12: 88021dc01200 R13: 88021dc00cc0 R14: 88021dc01200 R15: 0001 FS: () GS:88021e30() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7f3e65c1c730 CR3: 01e06000 CR4: 001406e0 Stack: 8127a4fc 0a01ff10 024082c4 88021dc01200 88021dc00cc0 88021dc01200 0001 880036443c58 81247ac6 88021e31b360 880036433a00 880036433a00 Call Trace: [] ? __d_lookup+0x9c/0x160 [] ___slab_alloc+0x396/0x4a0 [] ? ath10k_wmi_event_service_ready_work+0x5ad/0x800 [ath10k_core] [] ? alloc_kmem_pages+0x9/0x10 [] ? kmalloc_order+0x13/0x40 [] ? ath10k_wmi_event_service_ready_work+0x5ad/0x800 [ath10k_core] [] __slab_alloc.isra.72+0x26/0x40 [] __kmalloc+0x147/0x1b0 [] ath10k_wmi_event_service_ready_work+0x5ad/0x800 [ath10k_core] [] ? dequeue_entity+0x261/0xac0 [] process_one_work+0x148/0x420 [] worker_thread+0x49/0x480 [] ? rescuer_thread+0x330/0x330 [] kthread+0xc4/0xe0 [] ret_from_fork+0x1f/0x40 [] ? kthread_create_on_node+0x170/0x170 Code: e9 65 fd ff ff 49 8b 57 20 48 8d 42 ff 83 e2 01 49 0f 44 c7 f0 80 08 40 e9 6f fd ff ff 89 c6 48 c7 c7 01 36 c7 81 e8 e8 40 fa ff <0f> 0b ba 00 10 00 00 be 5a 00 00 00 48 89 c7 48 d3 e2 e8 bf 18 RIP [] new_slab+0x39a/0x410 RSP ---[ end trace ea3b0043b2911d93 ]--- static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node) { if (unlikely(flags & GFP_SLAB_BUG_MASK)) { pr_emerg("gfp: %u\n", flags & GFP_SLAB_BUG_MASK); BUG(); } return allocate_slab(s, flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node); } Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH] Add .set_antenna callback in ath6kl driver to remove wireless core warns
On 06/08/2016 08:56 AM, Prasun Maiti wrote: Please help me how to test this one?? It will be great to me if you help me. I do not have time or interest, sorry. You basically need access to firmware source to do any useful development on this driver in my opinion, and I do not have access to that source. Why are you so concerned about the warning anyway? Thanks, Ben On Wed, Jun 8, 2016 at 9:21 PM, Ben Greear <gree...@candelatech.com> wrote: On 06/08/2016 08:46 AM, Prasun Maiti wrote: Please tell me if I mention that this code is untested in commit log, then could you check the code kindly and also help me to fix this type of warning? In my experience, ath6kl has very fragile and buggy firmware, so I would not add any new API to it unless you have tested this thoroughly. Thanks, Ben On Wed, Jun 8, 2016 at 9:00 PM, Valo, Kalle <kv...@qca.qualcomm.com> wrote: Prasun Maiti <prasunmait...@gmail.com> writes: I am not sure it works fine. Like ath6kl driver send another cmd to firmare, I have just filled up the cmd buffer with "tx_ant", and "rx_ant" values, then use "ath6kl_wmi_cmd_send()" api to send the cmd buffer to firmware. I have resend the patch as there are some errors in the previous patch. Let me know if any modifications are needed? I don't take untested code. In some special cases it might be ok to send untested code but even then it needs to be clearly stated in the commit log that it's untested. Please resend once you have tested this, I'm dropping this now. -- Kalle Valo -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: [PATCH] Add .set_antenna callback in ath6kl driver to remove wireless core warns
On 06/08/2016 08:56 AM, Prasun Maiti wrote: Please help me how to test this one?? It will be great to me if you help me. I do not have time or interest, sorry. You basically need access to firmware source to do any useful development on this driver in my opinion, and I do not have access to that source. Why are you so concerned about the warning anyway? Thanks, Ben On Wed, Jun 8, 2016 at 9:21 PM, Ben Greear wrote: On 06/08/2016 08:46 AM, Prasun Maiti wrote: Please tell me if I mention that this code is untested in commit log, then could you check the code kindly and also help me to fix this type of warning? In my experience, ath6kl has very fragile and buggy firmware, so I would not add any new API to it unless you have tested this thoroughly. Thanks, Ben On Wed, Jun 8, 2016 at 9:00 PM, Valo, Kalle wrote: Prasun Maiti writes: I am not sure it works fine. Like ath6kl driver send another cmd to firmare, I have just filled up the cmd buffer with "tx_ant", and "rx_ant" values, then use "ath6kl_wmi_cmd_send()" api to send the cmd buffer to firmware. I have resend the patch as there are some errors in the previous patch. Let me know if any modifications are needed? I don't take untested code. In some special cases it might be ok to send untested code but even then it needs to be clearly stated in the commit log that it's untested. Please resend once you have tested this, I'm dropping this now. -- Kalle Valo -- Ben Greear Candela Technologies Inc http://www.candelatech.com -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH] Add .set_antenna callback in ath6kl driver to remove wireless core warns
On 06/08/2016 08:46 AM, Prasun Maiti wrote: Please tell me if I mention that this code is untested in commit log, then could you check the code kindly and also help me to fix this type of warning? In my experience, ath6kl has very fragile and buggy firmware, so I would not add any new API to it unless you have tested this thoroughly. Thanks, Ben On Wed, Jun 8, 2016 at 9:00 PM, Valo, Kalle <kv...@qca.qualcomm.com> wrote: Prasun Maiti <prasunmait...@gmail.com> writes: I am not sure it works fine. Like ath6kl driver send another cmd to firmare, I have just filled up the cmd buffer with "tx_ant", and "rx_ant" values, then use "ath6kl_wmi_cmd_send()" api to send the cmd buffer to firmware. I have resend the patch as there are some errors in the previous patch. Let me know if any modifications are needed? I don't take untested code. In some special cases it might be ok to send untested code but even then it needs to be clearly stated in the commit log that it's untested. Please resend once you have tested this, I'm dropping this now. -- Kalle Valo -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: [PATCH] Add .set_antenna callback in ath6kl driver to remove wireless core warns
On 06/08/2016 08:46 AM, Prasun Maiti wrote: Please tell me if I mention that this code is untested in commit log, then could you check the code kindly and also help me to fix this type of warning? In my experience, ath6kl has very fragile and buggy firmware, so I would not add any new API to it unless you have tested this thoroughly. Thanks, Ben On Wed, Jun 8, 2016 at 9:00 PM, Valo, Kalle wrote: Prasun Maiti writes: I am not sure it works fine. Like ath6kl driver send another cmd to firmare, I have just filled up the cmd buffer with "tx_ant", and "rx_ant" values, then use "ath6kl_wmi_cmd_send()" api to send the cmd buffer to firmware. I have resend the patch as there are some errors in the previous patch. Let me know if any modifications are needed? I don't take untested code. In some special cases it might be ok to send untested code but even then it needs to be clearly stated in the commit log that it's untested. Please resend once you have tested this, I'm dropping this now. -- Kalle Valo -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don???t modify ip_summed; doing so treats packets with bad checksums as good.
On 05/13/2016 11:21 AM, David Miller wrote: From: Ben Greear <gree...@candelatech.com> Date: Fri, 13 May 2016 09:57:19 -0700 How do you feel about a new socket-option to allow a socket to request the old veth behaviour? I depend upon the opinions of the experts who work upstream on and maintain these components, since it is an area I am not so familiar with. Generally speaking asking me directly for opinions on matters like this isn't the way to go, in fact I kind of find it irritating. It can't all be on me. Fair enough, thanks for your time. Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don???t modify ip_summed; doing so treats packets with bad checksums as good.
On 05/13/2016 11:21 AM, David Miller wrote: From: Ben Greear Date: Fri, 13 May 2016 09:57:19 -0700 How do you feel about a new socket-option to allow a socket to request the old veth behaviour? I depend upon the opinions of the experts who work upstream on and maintain these components, since it is an area I am not so familiar with. Generally speaking asking me directly for opinions on matters like this isn't the way to go, in fact I kind of find it irritating. It can't all be on me. Fair enough, thanks for your time. Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don???t modify ip_summed; doing so treats packets with bad checksums as good.
Mr Miller: How do you feel about a new socket-option to allow a socket to request the old veth behaviour? Thanks, Ben On 04/30/2016 10:30 PM, Willy Tarreau wrote: On Sat, Apr 30, 2016 at 03:43:51PM -0700, Ben Greear wrote: On 04/30/2016 03:01 PM, Vijay Pandurangan wrote: Consider: - App A sends out corrupt packets 50% of the time and discards inbound data. (...) How can you make a generic app C know how to do this? The path could be, for instance: eth0 <-> user-space-A <-> vethA <-> vethB <-> { kernel routing logic } <-> vethC <-> vethD <-> appC There are no sockets on vethB, but it does need to have special behaviour to elide csums. Even if appC is hacked to know how to twiddle some thing on it's veth port, mucking with vethD will have no effect on vethB. With regard to your example above, why would A corrupt packets? My guess: 1) It has bugs (so, fix the bugs, it could equally create incorrect data with proper checksums, so just enabling checksumming adds no useful protection.) I agree with Ben here, what he needs is the ability for userspace to be trusted when *forwarding* a packet. Ideally you'd only want to receive the csum status per packet on the packet socket and pass the same value on the vethA interface, with this status being kept when the packet reaches vethB. If A purposely corrupts packet, it's A's problem. It's similar to designing a NIC which intentionally corrupts packets and reports "checksum good". The real issue is that in order to do things right, the userspace bridge (here, "A") would really need to pass this status. In Ben's case as he says, bad checksum packets are dropped before reaching A, so that simplifies the process quite a bit and that might be what causes some confusion, but ideally we'd rather have recvmsg() and sendmsg() with these flags. I faced the exact same issue 3 years ago when playing with netmap, it was slow as hell because it would lose all checksum information when packets were passing through userland, resulting in GRO/GSO etc being disabled, and had to modify it to let userland preserve it. That's especially important when you have to deal with possibly corrupted packets not yet detected in the chain because the NIC did not validate their checksums. Willy -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don???t modify ip_summed; doing so treats packets with bad checksums as good.
Mr Miller: How do you feel about a new socket-option to allow a socket to request the old veth behaviour? Thanks, Ben On 04/30/2016 10:30 PM, Willy Tarreau wrote: On Sat, Apr 30, 2016 at 03:43:51PM -0700, Ben Greear wrote: On 04/30/2016 03:01 PM, Vijay Pandurangan wrote: Consider: - App A sends out corrupt packets 50% of the time and discards inbound data. (...) How can you make a generic app C know how to do this? The path could be, for instance: eth0 <-> user-space-A <-> vethA <-> vethB <-> { kernel routing logic } <-> vethC <-> vethD <-> appC There are no sockets on vethB, but it does need to have special behaviour to elide csums. Even if appC is hacked to know how to twiddle some thing on it's veth port, mucking with vethD will have no effect on vethB. With regard to your example above, why would A corrupt packets? My guess: 1) It has bugs (so, fix the bugs, it could equally create incorrect data with proper checksums, so just enabling checksumming adds no useful protection.) I agree with Ben here, what he needs is the ability for userspace to be trusted when *forwarding* a packet. Ideally you'd only want to receive the csum status per packet on the packet socket and pass the same value on the vethA interface, with this status being kept when the packet reaches vethB. If A purposely corrupts packet, it's A's problem. It's similar to designing a NIC which intentionally corrupts packets and reports "checksum good". The real issue is that in order to do things right, the userspace bridge (here, "A") would really need to pass this status. In Ben's case as he says, bad checksum packets are dropped before reaching A, so that simplifies the process quite a bit and that might be what causes some confusion, but ideally we'd rather have recvmsg() and sendmsg() with these flags. I faced the exact same issue 3 years ago when playing with netmap, it was slow as hell because it would lose all checksum information when packets were passing through userland, resulting in GRO/GSO etc being disabled, and had to modify it to let userland preserve it. That's especially important when you have to deal with possibly corrupted packets not yet detected in the chain because the NIC did not validate their checksums. Willy -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
On 04/30/2016 03:01 PM, Vijay Pandurangan wrote: On Sat, Apr 30, 2016 at 5:52 PM, Ben Greear <gree...@candelatech.com> wrote: Good point, so if you had: eth0 <-> raw <-> user space-bridge <-> raw <-> vethA <-> veth B <-> userspace-stub <->eth1 and user-space hub enabled this elide flag, things would work, right? Then, it seems like what we need is a way to tell the kernel router/bridge logic to follow elide signals in packets coming from veth. I'm not sure what the best way to do this is because I'm less familiar with conventions in that part of the kernel, but assuming there's a way to do this, would it be acceptable? You cannot receive on one veth without transmitting on the other, so I think the elide csum logic can go on the raw-socket, and apply to packets in the transmit-from-user-space direction. Just allowing the socket to make the veth behave like it used to before this patch in question should be good enough, since that worked for us for years. So, just an option to modify the ip_summed for pkts sent on a socket is probably sufficient. I don't think this is right. Consider: - App A sends out corrupt packets 50% of the time and discards inbound data. - App B doesn't care about corrupt packets and is happy to receive them and has some way of dealing with them (special case) - App C is a regular app, say nc or something. In your world, where A decides what happens to data it transmits, then A<--veth-->B and A<---wire-->B will have the same behaviour but A<-- veth --> C and A<-- wire --> C will have _different_ behaviour: C will behave incorrectly if it's connected over veth but correctly if connected with a wire. That is a bug. Since A cannot know what the app it's talking to will desire, I argue that both sides of a message must be opted in to this optimization. How can you make a generic app C know how to do this? The path could be, for instance: eth0 <-> user-space-A <-> vethA <-> vethB <-> { kernel routing logic } <-> vethC <-> vethD <-> appC There are no sockets on vethB, but it does need to have special behaviour to elide csums. Even if appC is hacked to know how to twiddle some thing on it's veth port, mucking with vethD will have no effect on vethB. With regard to your example above, why would A corrupt packets? My guess: 1) It has bugs (so, fix the bugs, it could equally create incorrect data with proper checksums, so just enabling checksumming adds no useful protection.) 2) It means to corrupt frames. In that case, someone must expect that C should receive incorrect frames, otherwise why bother making App-A corrupt them in the first place? 3) You are explicitly trying to test the kernel checksum logic, so you want the kernel to detect the bad checksum and throw away the packet. In this case, just don't set the socket option in appA to elide checksums and the packet will be thrown away. Any other cases you can think of? Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
On 04/30/2016 03:01 PM, Vijay Pandurangan wrote: On Sat, Apr 30, 2016 at 5:52 PM, Ben Greear wrote: Good point, so if you had: eth0 <-> raw <-> user space-bridge <-> raw <-> vethA <-> veth B <-> userspace-stub <->eth1 and user-space hub enabled this elide flag, things would work, right? Then, it seems like what we need is a way to tell the kernel router/bridge logic to follow elide signals in packets coming from veth. I'm not sure what the best way to do this is because I'm less familiar with conventions in that part of the kernel, but assuming there's a way to do this, would it be acceptable? You cannot receive on one veth without transmitting on the other, so I think the elide csum logic can go on the raw-socket, and apply to packets in the transmit-from-user-space direction. Just allowing the socket to make the veth behave like it used to before this patch in question should be good enough, since that worked for us for years. So, just an option to modify the ip_summed for pkts sent on a socket is probably sufficient. I don't think this is right. Consider: - App A sends out corrupt packets 50% of the time and discards inbound data. - App B doesn't care about corrupt packets and is happy to receive them and has some way of dealing with them (special case) - App C is a regular app, say nc or something. In your world, where A decides what happens to data it transmits, then A<--veth-->B and A<---wire-->B will have the same behaviour but A<-- veth --> C and A<-- wire --> C will have _different_ behaviour: C will behave incorrectly if it's connected over veth but correctly if connected with a wire. That is a bug. Since A cannot know what the app it's talking to will desire, I argue that both sides of a message must be opted in to this optimization. How can you make a generic app C know how to do this? The path could be, for instance: eth0 <-> user-space-A <-> vethA <-> vethB <-> { kernel routing logic } <-> vethC <-> vethD <-> appC There are no sockets on vethB, but it does need to have special behaviour to elide csums. Even if appC is hacked to know how to twiddle some thing on it's veth port, mucking with vethD will have no effect on vethB. With regard to your example above, why would A corrupt packets? My guess: 1) It has bugs (so, fix the bugs, it could equally create incorrect data with proper checksums, so just enabling checksumming adds no useful protection.) 2) It means to corrupt frames. In that case, someone must expect that C should receive incorrect frames, otherwise why bother making App-A corrupt them in the first place? 3) You are explicitly trying to test the kernel checksum logic, so you want the kernel to detect the bad checksum and throw away the packet. In this case, just don't set the socket option in appA to elide checksums and the packet will be thrown away. Any other cases you can think of? Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
On 04/30/2016 02:36 PM, Vijay Pandurangan wrote: On Sat, Apr 30, 2016 at 5:29 PM, Ben Greear <gree...@candelatech.com> wrote: On 04/30/2016 02:13 PM, Vijay Pandurangan wrote: On Sat, Apr 30, 2016 at 4:59 PM, Ben Greear <gree...@candelatech.com> wrote: On 04/30/2016 12:54 PM, Tom Herbert wrote: We've put considerable effort into cleaning up the checksum interface to make it as unambiguous as possible, please be very careful to follow it. Broken checksum processing is really hard to detect and debug. CHECKSUM_UNNECESSARY means that some number of _specific_ checksums (indicated by csum_level) have been verified to be correct in a packet. Blindly promoting CHECKSUM_NONE to CHECKSUM_UNNECESSARY is never right. If CHECKSUM_UNNECESSARY is set in such a manner but the checksum it would refer to has not been verified and is incorrect this is a major bug. Suppose I know that the packet received on a packet-socket has already been verified by a NIC that supports hardware checksumming. Then, I want to transmit it on a veth interface using a second packet socket. I do not want veth to recalculate the checksum on transmit, nor to validate it on the peer veth on receive, because I do not want to waste the CPU cycles. I am assuming that my app is not accidentally corrupting frames, so the checksum can never be bad. How should the checksumming be configured for the packets going into the packet-socket from user-space? It seems like that only the receiver should decide whether or not to checksum packets on the veth, not the sender. How about: We could add a receiving socket option for "don't checksum packets received from a veth when the other side has marked them as elide-checksum-suggested" (similar to UDP_NOCHECKSUM), and a sending socket option for "mark all data sent via this socket to a veth as elide-checksum-suggested". So the process would be: Writer: 1. open read socket 2. open write socket, with option elide-checksum-for-veth-suggested 3. write data Reader: 1. open read socket with "follow-elide-checksum-suggestions-on-veth" 2. read data The kernel / module would then need to persist the flag on all packets that traverse a veth, and drop these data when they leave the veth module. I'm not sure this works completely. In my app, the packet flow might be: eth0 <-> raw-socket <-> user-space-bridge <-> raw-socket <-> vethA <-> vethB <-> [kernel router/bridge logic ...] <-> eth1 Good point, so if you had: eth0 <-> raw <-> user space-bridge <-> raw <-> vethA <-> veth B <-> userspace-stub <->eth1 and user-space hub enabled this elide flag, things would work, right? Then, it seems like what we need is a way to tell the kernel router/bridge logic to follow elide signals in packets coming from veth. I'm not sure what the best way to do this is because I'm less familiar with conventions in that part of the kernel, but assuming there's a way to do this, would it be acceptable? You cannot receive on one veth without transmitting on the other, so I think the elide csum logic can go on the raw-socket, and apply to packets in the transmit-from-user-space direction. Just allowing the socket to make the veth behave like it used to before this patch in question should be good enough, since that worked for us for years. So, just an option to modify the ip_summed for pkts sent on a socket is probably sufficient. There may be no sockets on the vethB port. And reader/writer is not a good way to look at it since I am implementing a bi-directional bridge in user-space and each packet-socket is for both rx and tx. Sure, but we could model a bidrectional connection as two unidirectional sockets for our discussions here, right? Best not to I think, you want to make sure that one socket can correctly handle tx and rx. As long as that works, then using uni-directional sockets should work too. Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
On 04/30/2016 02:36 PM, Vijay Pandurangan wrote: On Sat, Apr 30, 2016 at 5:29 PM, Ben Greear wrote: On 04/30/2016 02:13 PM, Vijay Pandurangan wrote: On Sat, Apr 30, 2016 at 4:59 PM, Ben Greear wrote: On 04/30/2016 12:54 PM, Tom Herbert wrote: We've put considerable effort into cleaning up the checksum interface to make it as unambiguous as possible, please be very careful to follow it. Broken checksum processing is really hard to detect and debug. CHECKSUM_UNNECESSARY means that some number of _specific_ checksums (indicated by csum_level) have been verified to be correct in a packet. Blindly promoting CHECKSUM_NONE to CHECKSUM_UNNECESSARY is never right. If CHECKSUM_UNNECESSARY is set in such a manner but the checksum it would refer to has not been verified and is incorrect this is a major bug. Suppose I know that the packet received on a packet-socket has already been verified by a NIC that supports hardware checksumming. Then, I want to transmit it on a veth interface using a second packet socket. I do not want veth to recalculate the checksum on transmit, nor to validate it on the peer veth on receive, because I do not want to waste the CPU cycles. I am assuming that my app is not accidentally corrupting frames, so the checksum can never be bad. How should the checksumming be configured for the packets going into the packet-socket from user-space? It seems like that only the receiver should decide whether or not to checksum packets on the veth, not the sender. How about: We could add a receiving socket option for "don't checksum packets received from a veth when the other side has marked them as elide-checksum-suggested" (similar to UDP_NOCHECKSUM), and a sending socket option for "mark all data sent via this socket to a veth as elide-checksum-suggested". So the process would be: Writer: 1. open read socket 2. open write socket, with option elide-checksum-for-veth-suggested 3. write data Reader: 1. open read socket with "follow-elide-checksum-suggestions-on-veth" 2. read data The kernel / module would then need to persist the flag on all packets that traverse a veth, and drop these data when they leave the veth module. I'm not sure this works completely. In my app, the packet flow might be: eth0 <-> raw-socket <-> user-space-bridge <-> raw-socket <-> vethA <-> vethB <-> [kernel router/bridge logic ...] <-> eth1 Good point, so if you had: eth0 <-> raw <-> user space-bridge <-> raw <-> vethA <-> veth B <-> userspace-stub <->eth1 and user-space hub enabled this elide flag, things would work, right? Then, it seems like what we need is a way to tell the kernel router/bridge logic to follow elide signals in packets coming from veth. I'm not sure what the best way to do this is because I'm less familiar with conventions in that part of the kernel, but assuming there's a way to do this, would it be acceptable? You cannot receive on one veth without transmitting on the other, so I think the elide csum logic can go on the raw-socket, and apply to packets in the transmit-from-user-space direction. Just allowing the socket to make the veth behave like it used to before this patch in question should be good enough, since that worked for us for years. So, just an option to modify the ip_summed for pkts sent on a socket is probably sufficient. There may be no sockets on the vethB port. And reader/writer is not a good way to look at it since I am implementing a bi-directional bridge in user-space and each packet-socket is for both rx and tx. Sure, but we could model a bidrectional connection as two unidirectional sockets for our discussions here, right? Best not to I think, you want to make sure that one socket can correctly handle tx and rx. As long as that works, then using uni-directional sockets should work too. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
On 04/30/2016 02:13 PM, Vijay Pandurangan wrote: On Sat, Apr 30, 2016 at 4:59 PM, Ben Greear <gree...@candelatech.com> wrote: On 04/30/2016 12:54 PM, Tom Herbert wrote: We've put considerable effort into cleaning up the checksum interface to make it as unambiguous as possible, please be very careful to follow it. Broken checksum processing is really hard to detect and debug. CHECKSUM_UNNECESSARY means that some number of _specific_ checksums (indicated by csum_level) have been verified to be correct in a packet. Blindly promoting CHECKSUM_NONE to CHECKSUM_UNNECESSARY is never right. If CHECKSUM_UNNECESSARY is set in such a manner but the checksum it would refer to has not been verified and is incorrect this is a major bug. Suppose I know that the packet received on a packet-socket has already been verified by a NIC that supports hardware checksumming. Then, I want to transmit it on a veth interface using a second packet socket. I do not want veth to recalculate the checksum on transmit, nor to validate it on the peer veth on receive, because I do not want to waste the CPU cycles. I am assuming that my app is not accidentally corrupting frames, so the checksum can never be bad. How should the checksumming be configured for the packets going into the packet-socket from user-space? It seems like that only the receiver should decide whether or not to checksum packets on the veth, not the sender. How about: We could add a receiving socket option for "don't checksum packets received from a veth when the other side has marked them as elide-checksum-suggested" (similar to UDP_NOCHECKSUM), and a sending socket option for "mark all data sent via this socket to a veth as elide-checksum-suggested". So the process would be: Writer: 1. open read socket 2. open write socket, with option elide-checksum-for-veth-suggested 3. write data Reader: 1. open read socket with "follow-elide-checksum-suggestions-on-veth" 2. read data The kernel / module would then need to persist the flag on all packets that traverse a veth, and drop these data when they leave the veth module. I'm not sure this works completely. In my app, the packet flow might be: eth0 <-> raw-socket <-> user-space-bridge <-> raw-socket <-> vethA <-> vethB <-> [kernel router/bridge logic ...] <-> eth1 There may be no sockets on the vethB port. And reader/writer is not a good way to look at it since I am implementing a bi-directional bridge in user-space and each packet-socket is for both rx and tx. Also, I might want to send raw frames that do have broken checksums (lets assume a real NIC, not veth), and I want them to hit the wire with those bad checksums. How do I configure the checksumming in this case? Correct me if I'm wrong but I think this is already possible now. You can have packets with incorrect checksum hitting the wire as is. What you cannot do is instruct the receiving end to ignore the checksum from the sending end when using a physical device (and something I think we should mimic on the sending device). Yes, it does work currently (or, last I checked)...I just want to make sure it keeps working. Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
On 04/30/2016 02:13 PM, Vijay Pandurangan wrote: On Sat, Apr 30, 2016 at 4:59 PM, Ben Greear wrote: On 04/30/2016 12:54 PM, Tom Herbert wrote: We've put considerable effort into cleaning up the checksum interface to make it as unambiguous as possible, please be very careful to follow it. Broken checksum processing is really hard to detect and debug. CHECKSUM_UNNECESSARY means that some number of _specific_ checksums (indicated by csum_level) have been verified to be correct in a packet. Blindly promoting CHECKSUM_NONE to CHECKSUM_UNNECESSARY is never right. If CHECKSUM_UNNECESSARY is set in such a manner but the checksum it would refer to has not been verified and is incorrect this is a major bug. Suppose I know that the packet received on a packet-socket has already been verified by a NIC that supports hardware checksumming. Then, I want to transmit it on a veth interface using a second packet socket. I do not want veth to recalculate the checksum on transmit, nor to validate it on the peer veth on receive, because I do not want to waste the CPU cycles. I am assuming that my app is not accidentally corrupting frames, so the checksum can never be bad. How should the checksumming be configured for the packets going into the packet-socket from user-space? It seems like that only the receiver should decide whether or not to checksum packets on the veth, not the sender. How about: We could add a receiving socket option for "don't checksum packets received from a veth when the other side has marked them as elide-checksum-suggested" (similar to UDP_NOCHECKSUM), and a sending socket option for "mark all data sent via this socket to a veth as elide-checksum-suggested". So the process would be: Writer: 1. open read socket 2. open write socket, with option elide-checksum-for-veth-suggested 3. write data Reader: 1. open read socket with "follow-elide-checksum-suggestions-on-veth" 2. read data The kernel / module would then need to persist the flag on all packets that traverse a veth, and drop these data when they leave the veth module. I'm not sure this works completely. In my app, the packet flow might be: eth0 <-> raw-socket <-> user-space-bridge <-> raw-socket <-> vethA <-> vethB <-> [kernel router/bridge logic ...] <-> eth1 There may be no sockets on the vethB port. And reader/writer is not a good way to look at it since I am implementing a bi-directional bridge in user-space and each packet-socket is for both rx and tx. Also, I might want to send raw frames that do have broken checksums (lets assume a real NIC, not veth), and I want them to hit the wire with those bad checksums. How do I configure the checksumming in this case? Correct me if I'm wrong but I think this is already possible now. You can have packets with incorrect checksum hitting the wire as is. What you cannot do is instruct the receiving end to ignore the checksum from the sending end when using a physical device (and something I think we should mimic on the sending device). Yes, it does work currently (or, last I checked)...I just want to make sure it keeps working. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
On 04/30/2016 12:54 PM, Tom Herbert wrote: We've put considerable effort into cleaning up the checksum interface to make it as unambiguous as possible, please be very careful to follow it. Broken checksum processing is really hard to detect and debug. CHECKSUM_UNNECESSARY means that some number of _specific_ checksums (indicated by csum_level) have been verified to be correct in a packet. Blindly promoting CHECKSUM_NONE to CHECKSUM_UNNECESSARY is never right. If CHECKSUM_UNNECESSARY is set in such a manner but the checksum it would refer to has not been verified and is incorrect this is a major bug. Suppose I know that the packet received on a packet-socket has already been verified by a NIC that supports hardware checksumming. Then, I want to transmit it on a veth interface using a second packet socket. I do not want veth to recalculate the checksum on transmit, nor to validate it on the peer veth on receive, because I do not want to waste the CPU cycles. I am assuming that my app is not accidentally corrupting frames, so the checksum can never be bad. How should the checksumming be configured for the packets going into the packet-socket from user-space? Also, I might want to send raw frames that do have broken checksums (lets assume a real NIC, not veth), and I want them to hit the wire with those bad checksums. How do I configure the checksumming in this case? Thanks, Ben Tom On Sat, Apr 30, 2016 at 12:40 PM, Ben Greear <gree...@candelatech.com> wrote: On 04/30/2016 11:33 AM, Ben Hutchings wrote: On Thu, 2016-04-28 at 12:29 +0200, Sabrina Dubroca wrote: Hello, http://dmz2.candelatech.com/?p=linux-4.4.dev.y/.git;a=commitdiff;h=8153e983c0e5eba1aafe1fc296248ed2a553f1ac;hp=454b07405d694dad52e7f41af5816eed0190da8a Actually, no, this is not really a regression. [...] It really is. Even though the old behaviour was a bug (raw packets should not be changed), if there are real applications that depend on that then we have to keep those applications working somehow. To be honest, I fail to see why the old behaviour is a bug when sending raw packets from user-space. If raw packets should not be changed, then we need some way to specify what the checksum setting is to begin with, otherwise, user-space has not enough control. A socket option for new programs, and sysctl configurable defaults for raw sockets for old binary programs would be sufficient I think. Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
On 04/30/2016 12:54 PM, Tom Herbert wrote: We've put considerable effort into cleaning up the checksum interface to make it as unambiguous as possible, please be very careful to follow it. Broken checksum processing is really hard to detect and debug. CHECKSUM_UNNECESSARY means that some number of _specific_ checksums (indicated by csum_level) have been verified to be correct in a packet. Blindly promoting CHECKSUM_NONE to CHECKSUM_UNNECESSARY is never right. If CHECKSUM_UNNECESSARY is set in such a manner but the checksum it would refer to has not been verified and is incorrect this is a major bug. Suppose I know that the packet received on a packet-socket has already been verified by a NIC that supports hardware checksumming. Then, I want to transmit it on a veth interface using a second packet socket. I do not want veth to recalculate the checksum on transmit, nor to validate it on the peer veth on receive, because I do not want to waste the CPU cycles. I am assuming that my app is not accidentally corrupting frames, so the checksum can never be bad. How should the checksumming be configured for the packets going into the packet-socket from user-space? Also, I might want to send raw frames that do have broken checksums (lets assume a real NIC, not veth), and I want them to hit the wire with those bad checksums. How do I configure the checksumming in this case? Thanks, Ben Tom On Sat, Apr 30, 2016 at 12:40 PM, Ben Greear wrote: On 04/30/2016 11:33 AM, Ben Hutchings wrote: On Thu, 2016-04-28 at 12:29 +0200, Sabrina Dubroca wrote: Hello, http://dmz2.candelatech.com/?p=linux-4.4.dev.y/.git;a=commitdiff;h=8153e983c0e5eba1aafe1fc296248ed2a553f1ac;hp=454b07405d694dad52e7f41af5816eed0190da8a Actually, no, this is not really a regression. [...] It really is. Even though the old behaviour was a bug (raw packets should not be changed), if there are real applications that depend on that then we have to keep those applications working somehow. To be honest, I fail to see why the old behaviour is a bug when sending raw packets from user-space. If raw packets should not be changed, then we need some way to specify what the checksum setting is to begin with, otherwise, user-space has not enough control. A socket option for new programs, and sysctl configurable defaults for raw sockets for old binary programs would be sufficient I think. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
On 04/30/2016 11:33 AM, Ben Hutchings wrote: On Thu, 2016-04-28 at 12:29 +0200, Sabrina Dubroca wrote: Hello, http://dmz2.candelatech.com/?p=linux-4.4.dev.y/.git;a=commitdiff;h=8153e983c0e5eba1aafe1fc296248ed2a553f1ac;hp=454b07405d694dad52e7f41af5816eed0190da8a Actually, no, this is not really a regression. [...] It really is. Even though the old behaviour was a bug (raw packets should not be changed), if there are real applications that depend on that then we have to keep those applications working somehow. To be honest, I fail to see why the old behaviour is a bug when sending raw packets from user-space. If raw packets should not be changed, then we need some way to specify what the checksum setting is to begin with, otherwise, user-space has not enough control. A socket option for new programs, and sysctl configurable defaults for raw sockets for old binary programs would be sufficient I think. Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
On 04/30/2016 11:33 AM, Ben Hutchings wrote: On Thu, 2016-04-28 at 12:29 +0200, Sabrina Dubroca wrote: Hello, http://dmz2.candelatech.com/?p=linux-4.4.dev.y/.git;a=commitdiff;h=8153e983c0e5eba1aafe1fc296248ed2a553f1ac;hp=454b07405d694dad52e7f41af5816eed0190da8a Actually, no, this is not really a regression. [...] It really is. Even though the old behaviour was a bug (raw packets should not be changed), if there are real applications that depend on that then we have to keep those applications working somehow. To be honest, I fail to see why the old behaviour is a bug when sending raw packets from user-space. If raw packets should not be changed, then we need some way to specify what the checksum setting is to begin with, otherwise, user-space has not enough control. A socket option for new programs, and sysctl configurable defaults for raw sockets for old binary programs would be sufficient I think. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
On 04/28/2016 03:29 AM, Sabrina Dubroca wrote: Hello, 2016-04-27, 17:14:44 -0700, Ben Greear wrote: On 04/27/2016 05:00 PM, Hannes Frederic Sowa wrote: Hi Ben, On Wed, Apr 27, 2016, at 20:07, Ben Hutchings wrote: On Wed, 2016-04-27 at 08:59 -0700, Ben Greear wrote: On 04/26/2016 04:02 PM, Ben Hutchings wrote: 3.2.80-rc1 review patch. If anyone has any objections, please let me know. I would be careful about this. It causes regressions when sending PACKET_SOCKET buffers from user-space to veth devices. There was a proposed upstream fix for the regression, but it has not gone into the tree as far as I know. http://www.spinics.net/lists/netdev/msg370436.html [...] OK, I'll drop this for now. The fall out from not having this patch is in my opinion a bigger fallout than not having this patch. This patch fixes silent data corruption vs. the problem Ben Greear is talking about, which might not be that a common usage. What do others think? Bye, Hannes This patch from Cong Wang seems to fix the regression for me, I think it should be added and tested in the main tree, and then apply them to stable as a pair. http://dmz2.candelatech.com/?p=linux-4.4.dev.y/.git;a=commitdiff;h=8153e983c0e5eba1aafe1fc296248ed2a553f1ac;hp=454b07405d694dad52e7f41af5816eed0190da8a Actually, no, this is not really a regression. If you capture packets on a device with checksum offloading enabled, the TCP/UDP checksum isn't filled. veth also behaves that way. What the "veth: don't modify ip_summed" patch does is enable proper checksum validation on veth. This really was a bug in veth. Cong's patch would also break cases where we choose to inject packets with invalid checksums, and they would now be accepted as correct. Your use case is invalid, it just happened to work because of a bug. If you want the stack to fill checksums so that you want capture and reinject packets, you have to disable checksum offloading (or compute the checksum yourself in userspace). Disabling checksum offloading or computing in user-space (and then recomputing in veth to verify the checksum?) is a huge performance loss. Maybe we could add a socket option to enable Cong's patch on a per-socket basis? That way my use-case can still work and you can have this new behaviour by default? Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
On 04/28/2016 03:29 AM, Sabrina Dubroca wrote: Hello, 2016-04-27, 17:14:44 -0700, Ben Greear wrote: On 04/27/2016 05:00 PM, Hannes Frederic Sowa wrote: Hi Ben, On Wed, Apr 27, 2016, at 20:07, Ben Hutchings wrote: On Wed, 2016-04-27 at 08:59 -0700, Ben Greear wrote: On 04/26/2016 04:02 PM, Ben Hutchings wrote: 3.2.80-rc1 review patch. If anyone has any objections, please let me know. I would be careful about this. It causes regressions when sending PACKET_SOCKET buffers from user-space to veth devices. There was a proposed upstream fix for the regression, but it has not gone into the tree as far as I know. http://www.spinics.net/lists/netdev/msg370436.html [...] OK, I'll drop this for now. The fall out from not having this patch is in my opinion a bigger fallout than not having this patch. This patch fixes silent data corruption vs. the problem Ben Greear is talking about, which might not be that a common usage. What do others think? Bye, Hannes This patch from Cong Wang seems to fix the regression for me, I think it should be added and tested in the main tree, and then apply them to stable as a pair. http://dmz2.candelatech.com/?p=linux-4.4.dev.y/.git;a=commitdiff;h=8153e983c0e5eba1aafe1fc296248ed2a553f1ac;hp=454b07405d694dad52e7f41af5816eed0190da8a Actually, no, this is not really a regression. If you capture packets on a device with checksum offloading enabled, the TCP/UDP checksum isn't filled. veth also behaves that way. What the "veth: don't modify ip_summed" patch does is enable proper checksum validation on veth. This really was a bug in veth. Cong's patch would also break cases where we choose to inject packets with invalid checksums, and they would now be accepted as correct. Your use case is invalid, it just happened to work because of a bug. If you want the stack to fill checksums so that you want capture and reinject packets, you have to disable checksum offloading (or compute the checksum yourself in userspace). Disabling checksum offloading or computing in user-space (and then recomputing in veth to verify the checksum?) is a huge performance loss. Maybe we could add a socket option to enable Cong's patch on a per-socket basis? That way my use-case can still work and you can have this new behaviour by default? Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
On 04/27/2016 05:00 PM, Hannes Frederic Sowa wrote: Hi Ben, On Wed, Apr 27, 2016, at 20:07, Ben Hutchings wrote: On Wed, 2016-04-27 at 08:59 -0700, Ben Greear wrote: On 04/26/2016 04:02 PM, Ben Hutchings wrote: 3.2.80-rc1 review patch. If anyone has any objections, please let me know. I would be careful about this. It causes regressions when sending PACKET_SOCKET buffers from user-space to veth devices. There was a proposed upstream fix for the regression, but it has not gone into the tree as far as I know. http://www.spinics.net/lists/netdev/msg370436.html [...] OK, I'll drop this for now. The fall out from not having this patch is in my opinion a bigger fallout than not having this patch. This patch fixes silent data corruption vs. the problem Ben Greear is talking about, which might not be that a common usage. What do others think? Bye, Hannes This patch from Cong Wang seems to fix the regression for me, I think it should be added and tested in the main tree, and then apply them to stable as a pair. http://dmz2.candelatech.com/?p=linux-4.4.dev.y/.git;a=commitdiff;h=8153e983c0e5eba1aafe1fc296248ed2a553f1ac;hp=454b07405d694dad52e7f41af5816eed0190da8a diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index da1ae0e..f8cc758 100644 (file) --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -1926,6 +1926,7 @@ retry: goto out_unlock; } + skb->ip_summed = CHECKSUM_UNNECESSARY; skb->protocol = proto; skb->dev = dev; skb->priority = sk->sk_priority; @@ -2352,6 +2353,7 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb, ph.raw = frame; + skb->ip_summed = CHECKSUM_UNNECESSARY; skb->protocol = proto; skb->dev = dev; skb->priority = po->sk.sk_priority; @@ -2776,6 +2778,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) goto out_free; } + skb->ip_summed = CHECKSUM_UNNECESSARY; skb->protocol = proto; skb->dev = dev; skb->priority = sk->sk_priority; Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
On 04/27/2016 05:00 PM, Hannes Frederic Sowa wrote: Hi Ben, On Wed, Apr 27, 2016, at 20:07, Ben Hutchings wrote: On Wed, 2016-04-27 at 08:59 -0700, Ben Greear wrote: On 04/26/2016 04:02 PM, Ben Hutchings wrote: 3.2.80-rc1 review patch. If anyone has any objections, please let me know. I would be careful about this. It causes regressions when sending PACKET_SOCKET buffers from user-space to veth devices. There was a proposed upstream fix for the regression, but it has not gone into the tree as far as I know. http://www.spinics.net/lists/netdev/msg370436.html [...] OK, I'll drop this for now. The fall out from not having this patch is in my opinion a bigger fallout than not having this patch. This patch fixes silent data corruption vs. the problem Ben Greear is talking about, which might not be that a common usage. What do others think? Bye, Hannes This patch from Cong Wang seems to fix the regression for me, I think it should be added and tested in the main tree, and then apply them to stable as a pair. http://dmz2.candelatech.com/?p=linux-4.4.dev.y/.git;a=commitdiff;h=8153e983c0e5eba1aafe1fc296248ed2a553f1ac;hp=454b07405d694dad52e7f41af5816eed0190da8a diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index da1ae0e..f8cc758 100644 (file) --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -1926,6 +1926,7 @@ retry: goto out_unlock; } + skb->ip_summed = CHECKSUM_UNNECESSARY; skb->protocol = proto; skb->dev = dev; skb->priority = sk->sk_priority; @@ -2352,6 +2353,7 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb, ph.raw = frame; + skb->ip_summed = CHECKSUM_UNNECESSARY; skb->protocol = proto; skb->dev = dev; skb->priority = po->sk.sk_priority; @@ -2776,6 +2778,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) goto out_free; } + skb->ip_summed = CHECKSUM_UNNECESSARY; skb->protocol = proto; skb->dev = dev; skb->priority = sk->sk_priority; Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
On 04/26/2016 04:02 PM, Ben Hutchings wrote: 3.2.80-rc1 review patch. If anyone has any objections, please let me know. I would be careful about this. It causes regressions when sending PACKET_SOCKET buffers from user-space to veth devices. There was a proposed upstream fix for the regression, but it has not gone into the tree as far as I know. http://www.spinics.net/lists/netdev/msg370436.html Thanks, Ben -- From: Vijay Pandurangan <vij...@vijayp.ca> [ Upstream commit ce8c839b74e3017996fad4e1b7ba2e2625ede82f ] Packets that arrive from real hardware devices have ip_summed == CHECKSUM_UNNECESSARY if the hardware verified the checksums, or CHECKSUM_NONE if the packet is bad or it was unable to verify it. The current version of veth will replace CHECKSUM_NONE with CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to a veth device to be delivered to the application. This caused applications at Twitter to receive corrupt data when network hardware was corrupting packets. We believe this was added as an optimization to skip computing and verifying checksums for communication between containers. However, locally generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as written does nothing for them. As far as we can tell, after removing this code, these packets are transmitted from one stack to another unmodified (tcpdump shows invalid checksums on both sides, as expected), and they are delivered correctly to applications. We didn’t test every possible network configuration, but we tried a few common ones such as bridging containers, using NAT between the host and a container, and routing from hardware devices to containers. We have effectively deployed this in production at Twitter (by disabling RX checksum offloading on veth devices). This code dates back to the first version of the driver, commit ("[NET]: Virtual ethernet device driver"), so I suspect this bug occurred mostly because the driver API has evolved significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix packet checksumming") (in December 2010) fixed this for packets that get created locally and sent to hardware devices, by not changing CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming in from hardware devices. Co-authored-by: Evan Jones <e...@evanjones.ca> Signed-off-by: Evan Jones <e...@evanjones.ca> Cc: Nicolas Dichtel <nicolas.dich...@6wind.com> Cc: Phil Sutter <p...@nwl.cc> Cc: Toshiaki Makita <makita.toshi...@lab.ntt.co.jp> Cc: net...@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Vijay Pandurangan <vij...@vijayp.ca> Acked-by: Cong Wang <cw...@twopensource.com> Signed-off-by: David S. Miller <da...@davemloft.net> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <b...@decadent.org.uk> --- drivers/net/veth.c | 6 -- 1 file changed, 6 deletions(-) --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -126,11 +126,6 @@ static netdev_tx_t veth_xmit(struct sk_b stats = this_cpu_ptr(priv->stats); rcv_stats = this_cpu_ptr(rcv_priv->stats); - /* don't change ip_summed == CHECKSUM_PARTIAL, as that - will cause bad checksum on forwarded packets */ - if (skb->ip_summed == CHECKSUM_NONE && - rcv->features & NETIF_F_RXCSUM) - skb->ip_summed = CHECKSUM_UNNECESSARY; length = skb->len; if (dev_forward_skb(rcv, skb) != NET_RX_SUCCESS) -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
On 04/26/2016 04:02 PM, Ben Hutchings wrote: 3.2.80-rc1 review patch. If anyone has any objections, please let me know. I would be careful about this. It causes regressions when sending PACKET_SOCKET buffers from user-space to veth devices. There was a proposed upstream fix for the regression, but it has not gone into the tree as far as I know. http://www.spinics.net/lists/netdev/msg370436.html Thanks, Ben -- From: Vijay Pandurangan [ Upstream commit ce8c839b74e3017996fad4e1b7ba2e2625ede82f ] Packets that arrive from real hardware devices have ip_summed == CHECKSUM_UNNECESSARY if the hardware verified the checksums, or CHECKSUM_NONE if the packet is bad or it was unable to verify it. The current version of veth will replace CHECKSUM_NONE with CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to a veth device to be delivered to the application. This caused applications at Twitter to receive corrupt data when network hardware was corrupting packets. We believe this was added as an optimization to skip computing and verifying checksums for communication between containers. However, locally generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as written does nothing for them. As far as we can tell, after removing this code, these packets are transmitted from one stack to another unmodified (tcpdump shows invalid checksums on both sides, as expected), and they are delivered correctly to applications. We didn’t test every possible network configuration, but we tried a few common ones such as bridging containers, using NAT between the host and a container, and routing from hardware devices to containers. We have effectively deployed this in production at Twitter (by disabling RX checksum offloading on veth devices). This code dates back to the first version of the driver, commit ("[NET]: Virtual ethernet device driver"), so I suspect this bug occurred mostly because the driver API has evolved significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix packet checksumming") (in December 2010) fixed this for packets that get created locally and sent to hardware devices, by not changing CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming in from hardware devices. Co-authored-by: Evan Jones Signed-off-by: Evan Jones Cc: Nicolas Dichtel Cc: Phil Sutter Cc: Toshiaki Makita Cc: net...@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Vijay Pandurangan Acked-by: Cong Wang Signed-off-by: David S. Miller [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings --- drivers/net/veth.c | 6 -- 1 file changed, 6 deletions(-) --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -126,11 +126,6 @@ static netdev_tx_t veth_xmit(struct sk_b stats = this_cpu_ptr(priv->stats); rcv_stats = this_cpu_ptr(rcv_priv->stats); - /* don't change ip_summed == CHECKSUM_PARTIAL, as that - will cause bad checksum on forwarded packets */ - if (skb->ip_summed == CHECKSUM_NONE && - rcv->features & NETIF_F_RXCSUM) - skb->ip_summed = CHECKSUM_UNNECESSARY; length = skb->len; if (dev_forward_skb(rcv, skb) != NET_RX_SUCCESS) -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: TCP reaching to maximum throughput after a long time
On 04/12/2016 01:29 PM, Eric Dumazet wrote: On Tue, 2016-04-12 at 13:23 -0700, Ben Greear wrote: It worked well enough for years that I didn't even know other algorithms were available. It was broken around 4.0 time, and I reported it to the list, and no one seemed to really care enough to do anything about it. I changed to reno and ignored the problem as well. It is trivially easy to see the regression when using ath10k NIC, and from this email thread, I guess other NICs have similar issues. Since it is so trivial, why don't you start a bisection ? I vaguely remember doing a bisect, but I can't find any email about that, so maybe I didn't. At any rate, it is somewhere between 3.17 and 4.0. From memory, it was between 3.19 and 4.0, but I am not certain of that. Neil's suggestion, from the thread below, is that it was likely: "605ad7f tcp: refine TSO autosizing" Here is previous email thread: https://www.mail-archive.com/netdev@vger.kernel.org/msg80803.html This one has a link to a pcap I made at the time: https://www.mail-archive.com/netdev@vger.kernel.org/msg80890.html I asked a capture, I did not say ' switch to Reno or whatever ', right ? Guessing is nice, but investigating and fixing is better. Do not assume that nothing can be done, please ? Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: TCP reaching to maximum throughput after a long time
On 04/12/2016 01:29 PM, Eric Dumazet wrote: On Tue, 2016-04-12 at 13:23 -0700, Ben Greear wrote: It worked well enough for years that I didn't even know other algorithms were available. It was broken around 4.0 time, and I reported it to the list, and no one seemed to really care enough to do anything about it. I changed to reno and ignored the problem as well. It is trivially easy to see the regression when using ath10k NIC, and from this email thread, I guess other NICs have similar issues. Since it is so trivial, why don't you start a bisection ? I vaguely remember doing a bisect, but I can't find any email about that, so maybe I didn't. At any rate, it is somewhere between 3.17 and 4.0. From memory, it was between 3.19 and 4.0, but I am not certain of that. Neil's suggestion, from the thread below, is that it was likely: "605ad7f tcp: refine TSO autosizing" Here is previous email thread: https://www.mail-archive.com/netdev@vger.kernel.org/msg80803.html This one has a link to a pcap I made at the time: https://www.mail-archive.com/netdev@vger.kernel.org/msg80890.html I asked a capture, I did not say ' switch to Reno or whatever ', right ? Guessing is nice, but investigating and fixing is better. Do not assume that nothing can be done, please ? Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: TCP reaching to maximum throughput after a long time
On 04/12/2016 01:17 PM, Eric Dumazet wrote: On Tue, 2016-04-12 at 13:11 -0700, Ben Greear wrote: On 04/12/2016 12:31 PM, Machani, Yaniv wrote: On Tue, Apr 12, 2016 at 18:04:52, Ben Greear wrote: On 04/12/2016 07:52 AM, Eric Dumazet wrote: On Tue, 2016-04-12 at 12:17 +, Machani, Yaniv wrote: If you are using 'Cubic' TCP congestion control, then please try something different. It was broken last I checked, at least when used with the ath10k driver. Thanks Ben, this indeed seems to be the issue ! Switching to reno got me to max throughput instantly. I'm still looking through the thread you have shared, but from what I understand there is no planned fix for it ? I think at the time it was blamed on ath10k and no one cared to try to fix it. Or, maybe no one really uses CUBIC anymore? Either way, I have no plans to try to fix CUBIC, but maybe someone who knows this code better could give it a try. Well, cubic seems to work in many cases, assuming they are not too many drops. Assuming one flow can get nominal speed in few RTT is kind a dream, and so far nobody claimed a CC was able to do that, while still being fair and resilient. TCP CC are full of heuristics, and by definition heuristics that were working 6 years ago might need to be refreshed. We are still maintaining Cubic for sure. It worked well enough for years that I didn't even know other algorithms were available. It was broken around 4.0 time, and I reported it to the list, and no one seemed to really care enough to do anything about it. I changed to reno and ignored the problem as well. It is trivially easy to see the regression when using ath10k NIC, and from this email thread, I guess other NICs have similar issues. Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: TCP reaching to maximum throughput after a long time
On 04/12/2016 01:17 PM, Eric Dumazet wrote: On Tue, 2016-04-12 at 13:11 -0700, Ben Greear wrote: On 04/12/2016 12:31 PM, Machani, Yaniv wrote: On Tue, Apr 12, 2016 at 18:04:52, Ben Greear wrote: On 04/12/2016 07:52 AM, Eric Dumazet wrote: On Tue, 2016-04-12 at 12:17 +, Machani, Yaniv wrote: If you are using 'Cubic' TCP congestion control, then please try something different. It was broken last I checked, at least when used with the ath10k driver. Thanks Ben, this indeed seems to be the issue ! Switching to reno got me to max throughput instantly. I'm still looking through the thread you have shared, but from what I understand there is no planned fix for it ? I think at the time it was blamed on ath10k and no one cared to try to fix it. Or, maybe no one really uses CUBIC anymore? Either way, I have no plans to try to fix CUBIC, but maybe someone who knows this code better could give it a try. Well, cubic seems to work in many cases, assuming they are not too many drops. Assuming one flow can get nominal speed in few RTT is kind a dream, and so far nobody claimed a CC was able to do that, while still being fair and resilient. TCP CC are full of heuristics, and by definition heuristics that were working 6 years ago might need to be refreshed. We are still maintaining Cubic for sure. It worked well enough for years that I didn't even know other algorithms were available. It was broken around 4.0 time, and I reported it to the list, and no one seemed to really care enough to do anything about it. I changed to reno and ignored the problem as well. It is trivially easy to see the regression when using ath10k NIC, and from this email thread, I guess other NICs have similar issues. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: TCP reaching to maximum throughput after a long time
On 04/12/2016 12:31 PM, Machani, Yaniv wrote: On Tue, Apr 12, 2016 at 18:04:52, Ben Greear wrote: On 04/12/2016 07:52 AM, Eric Dumazet wrote: On Tue, 2016-04-12 at 12:17 +, Machani, Yaniv wrote: If you are using 'Cubic' TCP congestion control, then please try something different. It was broken last I checked, at least when used with the ath10k driver. Thanks Ben, this indeed seems to be the issue ! Switching to reno got me to max throughput instantly. I'm still looking through the thread you have shared, but from what I understand there is no planned fix for it ? I think at the time it was blamed on ath10k and no one cared to try to fix it. Or, maybe no one really uses CUBIC anymore? Either way, I have no plans to try to fix CUBIC, but maybe someone who knows this code better could give it a try. Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: TCP reaching to maximum throughput after a long time
On 04/12/2016 12:31 PM, Machani, Yaniv wrote: On Tue, Apr 12, 2016 at 18:04:52, Ben Greear wrote: On 04/12/2016 07:52 AM, Eric Dumazet wrote: On Tue, 2016-04-12 at 12:17 +, Machani, Yaniv wrote: If you are using 'Cubic' TCP congestion control, then please try something different. It was broken last I checked, at least when used with the ath10k driver. Thanks Ben, this indeed seems to be the issue ! Switching to reno got me to max throughput instantly. I'm still looking through the thread you have shared, but from what I understand there is no planned fix for it ? I think at the time it was blamed on ath10k and no one cared to try to fix it. Or, maybe no one really uses CUBIC anymore? Either way, I have no plans to try to fix CUBIC, but maybe someone who knows this code better could give it a try. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: TCP reaching to maximum throughput after a long time
On 04/12/2016 07:52 AM, Eric Dumazet wrote: On Tue, 2016-04-12 at 12:17 +, Machani, Yaniv wrote: Hi, After updating from Kernel 3.14 to Kernel 4.4 we have seen a TCP performance degradation over Wi-Fi. In 3.14 kernel, TCP got to its max throughout after less than a second, while in the 4.4 it is taking ~20-30 seconds. UDP TX/RX and TCP RX performance is as expected. We are using a Beagle Bone Black and a WiLink8 device. Were there any related changes that might cause such behavior ? Kernel configuration and sysctl values were compared, but no significant differences have been found. If you are using 'Cubic' TCP congestion control, then please try something different. It was broken last I checked, at least when used with the ath10k driver. https://marc.info/?l=linux-netdev=144405216005715=2 Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: TCP reaching to maximum throughput after a long time
On 04/12/2016 07:52 AM, Eric Dumazet wrote: On Tue, 2016-04-12 at 12:17 +, Machani, Yaniv wrote: Hi, After updating from Kernel 3.14 to Kernel 4.4 we have seen a TCP performance degradation over Wi-Fi. In 3.14 kernel, TCP got to its max throughout after less than a second, while in the 4.4 it is taking ~20-30 seconds. UDP TX/RX and TCP RX performance is as expected. We are using a Beagle Bone Black and a WiLink8 device. Were there any related changes that might cause such behavior ? Kernel configuration and sysctl values were compared, but no significant differences have been found. If you are using 'Cubic' TCP congestion control, then please try something different. It was broken last I checked, at least when used with the ath10k driver. https://marc.info/?l=linux-netdev=144405216005715=2 Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Deadlock related to file permissions and/or cgroup, 4.4.6+
W O4.4.6+ #23 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. llvmpipe-2 D 880068d2fbc0 0 1356 1 0x 880068d2fbc0 00ff880069b5d3c8 88006bc9ff98 88006bc9ff80 880069b5cc08 880066bfcc00 880069b5cc00 880068d28000 880069b5cc00 8800682e7c60 880069b5ed50 8375c620 Call Trace: [] schedule+0xc3/0xe0 [] rwsem_down_read_failed+0x190/0x1f9 [] ? rt_mutex_futex_unlock+0x28/0x28 [] ? tlb_flush_mmu_free+0x24/0x6a [] ? match_held_lock+0x30/0x103 [] call_rwsem_down_read_failed+0x14/0x30 [] ? call_rwsem_down_read_failed+0x14/0x30 [] ? exit_signals+0x80/0x1ef [] ? percpu_down_read+0x60/0x72 [] exit_signals+0x80/0x1ef [] ? get_signal+0x8a2/0x8a2 [] ? __blocking_notifier_call_chain+0x2e/0x73 [] do_exit+0x224/0x1157 [] ? swapin_walk_pmd_entry+0x1c3/0x1c3 [] ? release_task+0x738/0x738 [] ? SyS_futex+0x152/0x1ee [] ? do_futex+0xb4d/0xb4d [] ? mark_held_locks+0x2d/0x90 [] ? lockdep_sys_exit+0x1a/0x91 [] ? lockdep_sys_exit_thunk+0x12/0x14 [] SyS_exit+0x1d/0x1d [] entry_SYSCALL_64_fastpath+0x16/0x7a 1 lock held by llvmpipe-2/1356: #0: (_threadgroup_rwsem){++}, at: [] exit_signals+0x80/0x1ef Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Deadlock related to file permissions and/or cgroup, 4.4.6+
W O4.4.6+ #23 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. llvmpipe-2 D 880068d2fbc0 0 1356 1 0x 880068d2fbc0 00ff880069b5d3c8 88006bc9ff98 88006bc9ff80 880069b5cc08 880066bfcc00 880069b5cc00 880068d28000 880069b5cc00 8800682e7c60 880069b5ed50 8375c620 Call Trace: [] schedule+0xc3/0xe0 [] rwsem_down_read_failed+0x190/0x1f9 [] ? rt_mutex_futex_unlock+0x28/0x28 [] ? tlb_flush_mmu_free+0x24/0x6a [] ? match_held_lock+0x30/0x103 [] call_rwsem_down_read_failed+0x14/0x30 [] ? call_rwsem_down_read_failed+0x14/0x30 [] ? exit_signals+0x80/0x1ef [] ? percpu_down_read+0x60/0x72 [] exit_signals+0x80/0x1ef [] ? get_signal+0x8a2/0x8a2 [] ? __blocking_notifier_call_chain+0x2e/0x73 [] do_exit+0x224/0x1157 [] ? swapin_walk_pmd_entry+0x1c3/0x1c3 [] ? release_task+0x738/0x738 [] ? SyS_futex+0x152/0x1ee [] ? do_futex+0xb4d/0xb4d [] ? mark_held_locks+0x2d/0x90 [] ? lockdep_sys_exit+0x1a/0x91 [] ? lockdep_sys_exit_thunk+0x12/0x14 [] SyS_exit+0x1d/0x1d [] entry_SYSCALL_64_fastpath+0x16/0x7a 1 lock held by llvmpipe-2/1356: #0: (_threadgroup_rwsem){++}, at: [] exit_signals+0x80/0x1ef Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: Question on rhashtable in worst-case scenario.
On 03/31/2016 05:46 PM, Herbert Xu wrote: On Thu, Mar 31, 2016 at 05:29:59PM +0200, Johannes Berg wrote: Does removing this completely disable the "-EEXIST" error? I can't say I fully understand the elasticity stuff in __rhashtable_insert_fast(). What EEXIST error are you talking about? The only one that can be returned on insertion is if you're explicitly checking for dups which clearly can't be the case for you. If you're talking about the EEXIST error due to a rehash then it is completely hidden from you by rhashtable_insert_rehash. If you actually meant EBUSY then yes this should prevent it from occurring, unless your chain-length exceeds 2^32. EEXIST was on removal, and was a symptom of the failure to insert, not really a problem itself. I reverted my revert (ie, back to rhashtable), added Johanne's patch to check insertion (and added my on pr_err there). I also added this: diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c index 38ef0be..c25b945 100644 --- a/net/mac80211/sta_info.c +++ b/net/mac80211/sta_info.c @@ -66,6 +66,7 @@ static const struct rhashtable_params sta_rht_params = { .nelem_hint = 3, /* start small */ + .insecure_elasticity = true, /* Disable chain-length checks. */ .automatic_shrinking = true, .head_offset = offsetof(struct sta_info, hash_node), .key_offset = offsetof(struct sta_info, addr), Now, my test case seems to pass, though I did have one strange issue before I put in the pr_err. I'm not sure if it was a hashtable issue or something else..but I have lots of something-else going on in this system, so I'd say that likely the patch above fixes rhashtable for my use case. I will of course let you know if I run into more issues that appear to be hashtable related! Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: Question on rhashtable in worst-case scenario.
On 03/31/2016 05:46 PM, Herbert Xu wrote: On Thu, Mar 31, 2016 at 05:29:59PM +0200, Johannes Berg wrote: Does removing this completely disable the "-EEXIST" error? I can't say I fully understand the elasticity stuff in __rhashtable_insert_fast(). What EEXIST error are you talking about? The only one that can be returned on insertion is if you're explicitly checking for dups which clearly can't be the case for you. If you're talking about the EEXIST error due to a rehash then it is completely hidden from you by rhashtable_insert_rehash. If you actually meant EBUSY then yes this should prevent it from occurring, unless your chain-length exceeds 2^32. EEXIST was on removal, and was a symptom of the failure to insert, not really a problem itself. I reverted my revert (ie, back to rhashtable), added Johanne's patch to check insertion (and added my on pr_err there). I also added this: diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c index 38ef0be..c25b945 100644 --- a/net/mac80211/sta_info.c +++ b/net/mac80211/sta_info.c @@ -66,6 +66,7 @@ static const struct rhashtable_params sta_rht_params = { .nelem_hint = 3, /* start small */ + .insecure_elasticity = true, /* Disable chain-length checks. */ .automatic_shrinking = true, .head_offset = offsetof(struct sta_info, hash_node), .key_offset = offsetof(struct sta_info, addr), Now, my test case seems to pass, though I did have one strange issue before I put in the pr_err. I'm not sure if it was a hashtable issue or something else..but I have lots of something-else going on in this system, so I'd say that likely the patch above fixes rhashtable for my use case. I will of course let you know if I run into more issues that appear to be hashtable related! Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: Question on rhashtable in worst-case scenario.
On 03/31/2016 12:46 AM, Johannes Berg wrote: On Wed, 2016-03-30 at 09:52 -0700, Ben Greear wrote: If someone can fix rhashtable, then great. I read some earlier comments [1] back when someone else reported similar problems, and the comments seemed to indicate that rhashtable was broken in this manner on purpose to protect against hashing attacks. If you are baking in this type of policy to what should be a basic data-type, then it is not useful for how it is being used in the mac80211 stack. [1] http://lkml.iu.edu/hypermail/linux/kernel/1512.2/01681.html That's not really saying it's purposely broken, that's more saying that Herbert didn't see a point in fixing a case that has awful behaviour already. However, I'm confused now - we can much more easily live with *insertion* failures, as the linked email indicates, than *deletion* failures, which I think you had indicated originally. Are you really seeing *deletion* failures? If there are in fact *deletion* failures then I think we really need to address those in rhashtable, no matter the worst-case behaviour of the hashing or keys, since we should be able to delete entries in order to get back to something reasonable. Looking at the code though, I don't actually see that happening. If you're seeing only *insertion* failures, which you indicated in the root of this thread, then I think for the general case in mac80211 we can live with that - we use a seeded jhash for the hash function, and since in the general case we cannot accept entries with identical MAC addresses to start with, it shouldn't be possible to run into this problem under normal use cases. I see insertion failure, and then later, if of course fails to delete as well since it was never inserted to begin with. There is no good way to deal with insertion error, so just need to fix the hashtable. In this case, I think perhaps you can just patch your local system with the many interfaces connecting to the same AP to add the parameter Herbert suggested (.insecure_elasticity = true in sta_rht_params). This is, after all, very much a case that "normal" operation doesn't even get close to. Old code, even stock kernels, could deal with this properly, so I think it should be fixed by default. I'll put rhash back in my tree and try that insecure option and see if it works. Thanks, Ben johannes -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: Question on rhashtable in worst-case scenario.
On 03/31/2016 12:46 AM, Johannes Berg wrote: On Wed, 2016-03-30 at 09:52 -0700, Ben Greear wrote: If someone can fix rhashtable, then great. I read some earlier comments [1] back when someone else reported similar problems, and the comments seemed to indicate that rhashtable was broken in this manner on purpose to protect against hashing attacks. If you are baking in this type of policy to what should be a basic data-type, then it is not useful for how it is being used in the mac80211 stack. [1] http://lkml.iu.edu/hypermail/linux/kernel/1512.2/01681.html That's not really saying it's purposely broken, that's more saying that Herbert didn't see a point in fixing a case that has awful behaviour already. However, I'm confused now - we can much more easily live with *insertion* failures, as the linked email indicates, than *deletion* failures, which I think you had indicated originally. Are you really seeing *deletion* failures? If there are in fact *deletion* failures then I think we really need to address those in rhashtable, no matter the worst-case behaviour of the hashing or keys, since we should be able to delete entries in order to get back to something reasonable. Looking at the code though, I don't actually see that happening. If you're seeing only *insertion* failures, which you indicated in the root of this thread, then I think for the general case in mac80211 we can live with that - we use a seeded jhash for the hash function, and since in the general case we cannot accept entries with identical MAC addresses to start with, it shouldn't be possible to run into this problem under normal use cases. I see insertion failure, and then later, if of course fails to delete as well since it was never inserted to begin with. There is no good way to deal with insertion error, so just need to fix the hashtable. In this case, I think perhaps you can just patch your local system with the many interfaces connecting to the same AP to add the parameter Herbert suggested (.insecure_elasticity = true in sta_rht_params). This is, after all, very much a case that "normal" operation doesn't even get close to. Old code, even stock kernels, could deal with this properly, so I think it should be fixed by default. I'll put rhash back in my tree and try that insecure option and see if it works. Thanks, Ben johannes -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: Question on rhashtable in worst-case scenario.
On 03/30/2016 09:38 AM, David Miller wrote: From: Johannes Berg <johan...@sipsolutions.net> Date: Wed, 30 Mar 2016 11:14:12 +0200 On Tue, 2016-03-29 at 09:16 -0700, Ben Greear wrote: Looks like rhashtable has too much policy in it to properly deal with cases where there are too many hash collisions, so I am going to work on reverting it's use in mac80211. I'm not really all that happy with that approach - can't we fix the rhashtable? It's a pretty rare corner case that many keys really are identical and no kind of hash algorithm, but it seems much better to still deal with it than to remove the rhashtable usage and go back to hand-rolling something. Yeah reverting seems like a really idiotic way to deal with the issue. If someone can fix rhashtable, then great. I read some earlier comments [1] back when someone else reported similar problems, and the comments seemed to indicate that rhashtable was broken in this manner on purpose to protect against hashing attacks. If you are baking in this type of policy to what should be a basic data-type, then it is not useful for how it is being used in the mac80211 stack. [1] http://lkml.iu.edu/hypermail/linux/kernel/1512.2/01681.html Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: Question on rhashtable in worst-case scenario.
On 03/30/2016 09:38 AM, David Miller wrote: From: Johannes Berg Date: Wed, 30 Mar 2016 11:14:12 +0200 On Tue, 2016-03-29 at 09:16 -0700, Ben Greear wrote: Looks like rhashtable has too much policy in it to properly deal with cases where there are too many hash collisions, so I am going to work on reverting it's use in mac80211. I'm not really all that happy with that approach - can't we fix the rhashtable? It's a pretty rare corner case that many keys really are identical and no kind of hash algorithm, but it seems much better to still deal with it than to remove the rhashtable usage and go back to hand-rolling something. Yeah reverting seems like a really idiotic way to deal with the issue. If someone can fix rhashtable, then great. I read some earlier comments [1] back when someone else reported similar problems, and the comments seemed to indicate that rhashtable was broken in this manner on purpose to protect against hashing attacks. If you are baking in this type of policy to what should be a basic data-type, then it is not useful for how it is being used in the mac80211 stack. [1] http://lkml.iu.edu/hypermail/linux/kernel/1512.2/01681.html Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: Question on rhashtable in worst-case scenario.
Looks like rhashtable has too much policy in it to properly deal with cases where there are too many hash collisions, so I am going to work on reverting it's use in mac80211. Thanks, Ben On 03/28/2016 01:29 PM, Ben Greear wrote: Hello! I have a use case for mac80211 where I create multiple stations to the same remote peer MAC address. I'm seeing cases where the rhashtable logic is returning -16 (EBUSY) on insert (see sta_info_hash_add). This is with the 4.4.6+ (plus local patches) kernel, and it has the patch mentioned here: https://lkml.org/lkml/2015/12/3/307 If I understand the code properly, my use case is going to be worst-case scenario, where all of my items in the hash have the same key (peer mac addr). I have my own secondary hash to handle most of my hot-path lookups, but I still need the main hash to at least function in a linear-search manner. Any idea what I can do to get rid of the EBUSY return code problem, or how to debug it further? Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Re: Question on rhashtable in worst-case scenario.
Looks like rhashtable has too much policy in it to properly deal with cases where there are too many hash collisions, so I am going to work on reverting it's use in mac80211. Thanks, Ben On 03/28/2016 01:29 PM, Ben Greear wrote: Hello! I have a use case for mac80211 where I create multiple stations to the same remote peer MAC address. I'm seeing cases where the rhashtable logic is returning -16 (EBUSY) on insert (see sta_info_hash_add). This is with the 4.4.6+ (plus local patches) kernel, and it has the patch mentioned here: https://lkml.org/lkml/2015/12/3/307 If I understand the code properly, my use case is going to be worst-case scenario, where all of my items in the hash have the same key (peer mac addr). I have my own secondary hash to handle most of my hot-path lookups, but I still need the main hash to at least function in a linear-search manner. Any idea what I can do to get rid of the EBUSY return code problem, or how to debug it further? Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Question on rhashtable in worst-case scenario.
Hello! I have a use case for mac80211 where I create multiple stations to the same remote peer MAC address. I'm seeing cases where the rhashtable logic is returning -16 (EBUSY) on insert (see sta_info_hash_add). This is with the 4.4.6+ (plus local patches) kernel, and it has the patch mentioned here: https://lkml.org/lkml/2015/12/3/307 If I understand the code properly, my use case is going to be worst-case scenario, where all of my items in the hash have the same key (peer mac addr). I have my own secondary hash to handle most of my hot-path lookups, but I still need the main hash to at least function in a linear-search manner. Any idea what I can do to get rid of the EBUSY return code problem, or how to debug it further? Thanks, Ben -- Ben Greear <gree...@candelatech.com> Candela Technologies Inc http://www.candelatech.com
Question on rhashtable in worst-case scenario.
Hello! I have a use case for mac80211 where I create multiple stations to the same remote peer MAC address. I'm seeing cases where the rhashtable logic is returning -16 (EBUSY) on insert (see sta_info_hash_add). This is with the 4.4.6+ (plus local patches) kernel, and it has the patch mentioned here: https://lkml.org/lkml/2015/12/3/307 If I understand the code properly, my use case is going to be worst-case scenario, where all of my items in the hash have the same key (peer mac addr). I have my own secondary hash to handle most of my hot-path lookups, but I still need the main hash to at least function in a linear-search manner. Any idea what I can do to get rid of the EBUSY return code problem, or how to debug it further? Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com
Re: 3.19.0-rc4+ fails to compile, missing 'usr/missing-syscalls'
On 01/12/2015 12:55 PM, Guenter Roeck wrote: > On Mon, Jan 12, 2015 at 12:08:22PM -0800, Ben Greear wrote: >> >> Any idea what is wrong? >> >> -rc3 compiled ok, then I rebased just now, and get this: >> > My auto-builders are all happy, with no build or qemu failures. > Did you try "make mrproper" prior to rebuilding ? yes, and blew away .ccache too. But, I saw something weird like git suddenly got confused a bit earlier, so I will re-clone and try again in case the 'reset --hard HEAD' didn't clean up as much as I had hoped. Thanks, Ben > > Guenter > -- Ben Greear Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
3.19.0-rc4+ fails to compile, missing 'usr/missing-syscalls'
Any idea what is wrong? -rc3 compiled ok, then I rebased just now, and get this: cd /mnt/sda/home/greearb/git/linux-3.17.dev.y/drivers/net/wireless/ath/ath10k/ cd ~/kernel/2.6/linux-3.19.x64/; make -j8 bzImage modules CHK include/config/kernel.release GEN ./Makefile CHK include/generated/uapi/linux/version.h Using /mnt/sda/home/greearb/git/linux-3.19.dev.y as source for kernel CHK include/generated/utsrelease.h CC kernel/bounds.s GEN include/generated/bounds.h CC arch/x86/kernel/asm-offsets.s GEN include/generated/asm-offsets.h CALL/mnt/sda/home/greearb/git/linux-3.19.dev.y/scripts/checksyscalls.sh make[3]: *** No rule to make target `usr/missing-syscalls', needed by `__build'. Stop. make[3]: *** Waiting for unfinished jobs CC kernel/bounds.s CHK include/generated/compile.h CC init/main.o CC init/do_mounts.o CC init/do_mounts_rd.o /mnt/sda/home/greearb/git/linux-3.19.dev.y/kernel/Makefile:132: *** No X.509 certificates found *** CC init/do_mounts_initrd.o AS arch/x86/ia32/ia32entry.o make[2]: *** [usr] Error 2 make[2]: *** Waiting for unfinished jobs CC arch/x86/ia32/sys_ia32.o Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
3.19.0-rc4+ fails to compile, missing 'usr/missing-syscalls'
Any idea what is wrong? -rc3 compiled ok, then I rebased just now, and get this: cd /mnt/sda/home/greearb/git/linux-3.17.dev.y/drivers/net/wireless/ath/ath10k/ cd ~/kernel/2.6/linux-3.19.x64/; make -j8 bzImage modules CHK include/config/kernel.release GEN ./Makefile CHK include/generated/uapi/linux/version.h Using /mnt/sda/home/greearb/git/linux-3.19.dev.y as source for kernel CHK include/generated/utsrelease.h CC kernel/bounds.s GEN include/generated/bounds.h CC arch/x86/kernel/asm-offsets.s GEN include/generated/asm-offsets.h CALL/mnt/sda/home/greearb/git/linux-3.19.dev.y/scripts/checksyscalls.sh make[3]: *** No rule to make target `usr/missing-syscalls', needed by `__build'. Stop. make[3]: *** Waiting for unfinished jobs CC kernel/bounds.s CHK include/generated/compile.h CC init/main.o CC init/do_mounts.o CC init/do_mounts_rd.o /mnt/sda/home/greearb/git/linux-3.19.dev.y/kernel/Makefile:132: *** No X.509 certificates found *** CC init/do_mounts_initrd.o AS arch/x86/ia32/ia32entry.o make[2]: *** [usr] Error 2 make[2]: *** Waiting for unfinished jobs CC arch/x86/ia32/sys_ia32.o Thanks, Ben -- Ben Greear gree...@candelatech.com Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 3.19.0-rc4+ fails to compile, missing 'usr/missing-syscalls'
On 01/12/2015 12:55 PM, Guenter Roeck wrote: On Mon, Jan 12, 2015 at 12:08:22PM -0800, Ben Greear wrote: Any idea what is wrong? -rc3 compiled ok, then I rebased just now, and get this: My auto-builders are all happy, with no build or qemu failures. Did you try make mrproper prior to rebuilding ? yes, and blew away .ccache too. But, I saw something weird like git suddenly got confused a bit earlier, so I will re-clone and try again in case the 'reset --hard HEAD' didn't clean up as much as I had hoped. Thanks, Ben Guenter -- Ben Greear gree...@candelatech.com Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
kernel splat in clean 3.16, related to drm/i915
I thought I'd do some testing on a stock kernel for a change...saw this during bootup one one of my systems: [4.854068] [ cut here ] [4.854154] WARNING: CPU: 0 PID: 93 at /home/greearb/git/linux-2.6.clean/drivers/gpu/drm/i915/intel_panel.c:758 i9xx_enable_backlight+0xd4/0x100 [i915]() [4.854156] backlight already enabled [4.854166] Modules linked in: ata_generic pata_acpi i915(+) i2c_algo_bit video drm_kms_helper drm [4.854172] CPU: 0 PID: 93 Comm: systemd-udevd Not tainted 3.16.0 #1 [4.854175] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080015 05/31/2010 [4.854183] f759b730 c0a228f4 f759b770 f759b760 c045d24e f8616932 [4.854190] f759b78c 005d f8610188 02f6 f85e8124 f85e8124 f77e f77d8400 [4.854197] f75db400 f759b778 c045d2a3 0009 f759b770 f8616932 f759b78c f759b798 [4.854199] Call Trace: [4.854211] [] dump_stack+0x48/0x69 [4.854218] [] warn_slowpath_common+0x7e/0xa0 [4.854288] [] ? i9xx_enable_backlight+0xd4/0x100 [i915] [4.854356] [] ? i9xx_enable_backlight+0xd4/0x100 [i915] [4.854361] [] warn_slowpath_fmt+0x33/0x40 [4.854430] [] i9xx_enable_backlight+0xd4/0x100 [i915] [4.854500] [] intel_panel_enable_backlight+0x77/0xc0 [i915] [4.854568] [] intel_enable_lvds+0x13f/0x150 [i915] [4.854632] [] i9xx_crtc_enable+0x36d/0x400 [i915] [4.854696] [] __intel_set_mode+0x6cd/0x1400 [i915] [4.854761] [] ? intel_modeset_check_state+0x20b/0x740 [i915] [4.854825] [] intel_set_mode+0x23/0x40 [i915] [4.854889] [] intel_crtc_set_config+0x7c4/0xba0 [i915] [4.854916] [] drm_mode_set_config_internal+0x4e/0xb0 [drm] [4.854979] [] ? intel_user_framebuffer_destroy+0xb0/0xb0 [i915] [4.854989] [] restore_fbdev_mode+0xaa/0xe0 [drm_kms_helper] [4.854998] [] drm_fb_helper_restore_fbdev_mode_unlocked+0x1d/0x30 [drm_kms_helper] [4.855007] [] drm_fb_helper_set_par+0x26/0x60 [drm_kms_helper] [4.855014] [] fbcon_init+0x46a/0x4b0 [4.855021] [] visual_init+0x9e/0x100 [4.855026] [] do_bind_con_driver+0x10a/0x2c0 [4.855032] [] ? sysfs_create_file_ns+0x2c/0x30 [4.855038] [] do_take_over_console+0xfd/0x190 [4.855043] [] do_fbcon_takeover+0x5f/0xc0 [4.855048] [] fbcon_event_notify+0x5ff/0x740 [4.855049] [] notifier_call_chain+0x41/0x60 [4.855049] [] __blocking_notifier_call_chain+0x3b/0x60 [4.855049] [] blocking_notifier_call_chain+0x1f/0x30 [4.855049] [] fb_notifier_call_chain+0x16/0x20 [4.855049] [] register_framebuffer+0x1c7/0x2e0 [4.855049] [] drm_fb_helper_initial_config+0x2d4/0x1020 [drm_kms_helper] [4.855049] [] ? gen4_write32+0x49/0xe0 [i915] [4.855049] [] ? gen4_read32+0x3f/0xd0 [i915] [4.855049] [] ? intel_fbdev_init+0x193/0x510 [i915] [4.855049] [] intel_fbdev_initial_config+0x1c/0x20 [i915] [4.855049] [] i915_driver_load+0xe48/0xed0 [i915] [4.855049] [] ? i915_switcheroo_set_state+0x90/0x90 [i915] [4.855049] [] ? kobject_uevent_env+0x15a/0x690 [4.855049] [] ? cleanup_uevent_env+0x10/0x10 [4.855049] [] ? get_device+0x14/0x30 [4.855049] [] ? klist_class_dev_get+0x12/0x20 [4.855049] [] ? klist_node_init+0x2e/0x50 [4.855049] [] ? klist_add_tail+0x34/0x40 [4.855049] [] ? device_add+0x1b6/0x5c0 [4.855049] [] ? kobject_set_name_vargs+0x42/0x60 [4.855049] [] ? drm_sysfs_device_add+0xba/0x100 [drm] [4.855049] [] drm_dev_register+0x8e/0xe0 [drm] [4.855049] [] drm_get_pci_dev+0x79/0x1c0 [drm] [4.855049] [] i915_pci_probe+0x35/0x60 [i915] [4.855049] [] pci_device_probe+0x6f/0xc0 [4.855049] [] ? sysfs_create_link+0x25/0x40 [4.855049] [] driver_probe_device+0x93/0x3a0 [4.855049] [] ? pci_match_device+0xca/0x100 [4.855049] [] __driver_attach+0x71/0x80 [4.855049] [] ? __device_attach+0x40/0x40 [4.855049] [] bus_for_each_dev+0x47/0x80 [4.855049] [] driver_attach+0x1e/0x20 [4.855049] [] ? __device_attach+0x40/0x40 [4.855049] [] bus_add_driver+0x157/0x230 [4.855049] [] ? 0xf824afff [4.855049] [] ? 0xf824afff [4.855049] [] driver_register+0x59/0xe0 [4.855049] [] ? soft_offline_page+0x3bf/0x540 [4.855049] [] __pci_register_driver+0x32/0x40 [4.855049] [] drm_pci_init+0xe5/0x110 [drm] [4.855049] [] ? 0xf824afff [4.855049] [] i915_init+0x88/0x8a [i915] [4.855049] [] ? 0xf824afff [4.855049] [] do_one_initcall+0xaa/0x1e0 [4.855049] [] ? 0xf824afff [4.855049] [] ? kfree+0x11f/0x150 [4.855049] [] ? __vunmap+0x8f/0xe0 [4.855049] [] ? __vunmap+0x8f/0xe0 [4.855049] [] ? __vunmap+0x8f/0xe0 [4.855049] [] load_module+0x1c9f/0x21d0 [4.855049] [] SyS_finit_module+0x75/0xc0 [4.855049] [] ? vm_mmap_pgoff+0x7b/0xa0 [4.855049] [] sysenter_do_call+0x12/0x12 [4.855049] ---[ end trace 9ef1310c3c12d97e ]--- Thanks, Ben -- Ben
kernel splat in clean 3.16, related to drm/i915
[i915] [4.855049] [f824b000] ? 0xf824afff [4.855049] [c040211a] do_one_initcall+0xaa/0x1e0 [4.855049] [f824b000] ? 0xf824afff [4.855049] [c056682f] ? kfree+0x11f/0x150 [4.855049] [c0555c3f] ? __vunmap+0x8f/0xe0 [4.855049] [c0555c3f] ? __vunmap+0x8f/0xe0 [4.855049] [c0555c3f] ? __vunmap+0x8f/0xe0 [4.855049] [c04cf48f] load_module+0x1c9f/0x21d0 [4.855049] [c04cfb25] SyS_finit_module+0x75/0xc0 [4.855049] [c053bbfb] ? vm_mmap_pgoff+0x7b/0xa0 [4.855049] [c0a2964c] sysenter_do_call+0x12/0x12 [4.855049] ---[ end trace 9ef1310c3c12d97e ]--- Thanks, Ben -- Ben Greear gree...@candelatech.com Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Crashes in 3.14.13+ related to anon_vma_clone.
On 07/24/2014 04:08 PM, Ben Greear wrote: > A few of our systems are repeatedly crashing when upgraded from > the 3.14.6+ to 3.14.13+ kernels. Both kernels have a fair bit > of our out-of-tree patches, so could be our fault. Ahh, looks like a bad merge of a stable patch with a slightly different manually applied stable patch, causing double-free in the wifi stack... Sorry for the noise. Thanks, Ben > > But, in case this is a known problem...has anyone seen crashes like > this below? > > Test case is using some ath9k APs and stations, but not certain > that has much to do with the problem. > > BUG: unable to handle kernel paging request at 003f9840b000 > IP: [] anon_vma_clone+0x88/0xf5 > PGD be580067 PUD bab85067 PMD babd5067 PTE 8000bebbd065 > Oops: 0003 [#1] PREEMPT SMP > Modules linked in: 8021q garp stp mrp llc fuse macvlan wanlink(O) pktgen > ip6table_filter ip6_tables ebtable_nat ebtables coretemp hwmoo > CPU: 1 PID: 1777 Comm: btserver Tainted: G C O 3.14.13+ #19 > Hardware name: To be filled by O.E.M. To be filled by O.E.M./To be filled by > O.E.M., BIOS 4.6.3 03/06/2012 > task: 88020dcfb100 ti: 8800baa3e000 task.ti: 8800baa3e000 > RIP: 0010:[] [] anon_vma_clone+0x88/0xf5 > RSP: 0018:8800baa3fda0 EFLAGS: 00010246 > RAX: 8800ba8b8380 RBX: 003f9840b000 RCX: 0142b101 > RDX: RSI: 8800ba8b8380 RDI: 8800ba8b8388 > RBP: 8800baa3fdd0 R08: 00015d60 R09: 003f9840b000 > R10: 000b R11: 00016338 R12: > R13: 8800bc609450 R14: 8802212db980 R15: 8800ba8b8380 > FS: 7f0b70511740() GS:88022bc8() knlGS: > CS: 0010 DS: ES: CR0: 8005003b > CR2: 003f9840b000 CR3: bab8f000 CR4: 07e0 > Stack: > 880221ba92a0 88020dcfc980 8800bc609450 880221ba9228 > 8800bc609450 880221ba9228 8800baa3fe08 81169b7b > 88020dcfc980 880211db5f80 880221ba9228 8800bc609450 > Call Trace: > [] anon_vma_fork+0x26/0xec > [] copy_process.part.28+0xf98/0x16ca > [] do_fork+0xb4/0x211 > [] ? __set_task_blocked+0x5e/0x64 > [] ? _raw_spin_unlock_irq+0xc/0x1f > [] SyS_clone+0x11/0x13 > [] stub_clone+0x69/0x90 > [] ? system_call_fastpath+0x1a/0x1f > Code: 48 8b 3d 64 f1 b0 00 be d0 00 00 00 e8 9b 5e 01 00 48 85 c0 48 89 c3 74 > 62 45 31 e4 4d 8b 7e 08 4c 89 e7 49 8b 37 e8 fa ec ff ff > RIP [] anon_vma_clone+0x88/0xf5 > RSP > CR2: 003f9840b000 > -- Ben Greear Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Crashes in 3.14.13+ related to anon_vma_clone.
A few of our systems are repeatedly crashing when upgraded from the 3.14.6+ to 3.14.13+ kernels. Both kernels have a fair bit of our out-of-tree patches, so could be our fault. But, in case this is a known problem...has anyone seen crashes like this below? Test case is using some ath9k APs and stations, but not certain that has much to do with the problem. BUG: unable to handle kernel paging request at 003f9840b000 IP: [] anon_vma_clone+0x88/0xf5 PGD be580067 PUD bab85067 PMD babd5067 PTE 8000bebbd065 Oops: 0003 [#1] PREEMPT SMP Modules linked in: 8021q garp stp mrp llc fuse macvlan wanlink(O) pktgen ip6table_filter ip6_tables ebtable_nat ebtables coretemp hwmoo CPU: 1 PID: 1777 Comm: btserver Tainted: G C O 3.14.13+ #19 Hardware name: To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M., BIOS 4.6.3 03/06/2012 task: 88020dcfb100 ti: 8800baa3e000 task.ti: 8800baa3e000 RIP: 0010:[] [] anon_vma_clone+0x88/0xf5 RSP: 0018:8800baa3fda0 EFLAGS: 00010246 RAX: 8800ba8b8380 RBX: 003f9840b000 RCX: 0142b101 RDX: RSI: 8800ba8b8380 RDI: 8800ba8b8388 RBP: 8800baa3fdd0 R08: 00015d60 R09: 003f9840b000 R10: 000b R11: 00016338 R12: R13: 8800bc609450 R14: 8802212db980 R15: 8800ba8b8380 FS: 7f0b70511740() GS:88022bc8() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: 003f9840b000 CR3: bab8f000 CR4: 07e0 Stack: 880221ba92a0 88020dcfc980 8800bc609450 880221ba9228 8800bc609450 880221ba9228 8800baa3fe08 81169b7b 88020dcfc980 880211db5f80 880221ba9228 8800bc609450 Call Trace: [] anon_vma_fork+0x26/0xec [] copy_process.part.28+0xf98/0x16ca [] do_fork+0xb4/0x211 [] ? __set_task_blocked+0x5e/0x64 [] ? _raw_spin_unlock_irq+0xc/0x1f [] SyS_clone+0x11/0x13 [] stub_clone+0x69/0x90 [] ? system_call_fastpath+0x1a/0x1f Code: 48 8b 3d 64 f1 b0 00 be d0 00 00 00 e8 9b 5e 01 00 48 85 c0 48 89 c3 74 62 45 31 e4 4d 8b 7e 08 4c 89 e7 49 8b 37 e8 fa ec ff ff RIP [] anon_vma_clone+0x88/0xf5 RSP CR2: 003f9840b000 -- Ben Greear Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Crashes in 3.14.13+ related to anon_vma_clone.
A few of our systems are repeatedly crashing when upgraded from the 3.14.6+ to 3.14.13+ kernels. Both kernels have a fair bit of our out-of-tree patches, so could be our fault. But, in case this is a known problem...has anyone seen crashes like this below? Test case is using some ath9k APs and stations, but not certain that has much to do with the problem. BUG: unable to handle kernel paging request at 003f9840b000 IP: [81169ae8] anon_vma_clone+0x88/0xf5 PGD be580067 PUD bab85067 PMD babd5067 PTE 8000bebbd065 Oops: 0003 [#1] PREEMPT SMP Modules linked in: 8021q garp stp mrp llc fuse macvlan wanlink(O) pktgen ip6table_filter ip6_tables ebtable_nat ebtables coretemp hwmoo CPU: 1 PID: 1777 Comm: btserver Tainted: G C O 3.14.13+ #19 Hardware name: To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M., BIOS 4.6.3 03/06/2012 task: 88020dcfb100 ti: 8800baa3e000 task.ti: 8800baa3e000 RIP: 0010:[81169ae8] [81169ae8] anon_vma_clone+0x88/0xf5 RSP: 0018:8800baa3fda0 EFLAGS: 00010246 RAX: 8800ba8b8380 RBX: 003f9840b000 RCX: 0142b101 RDX: RSI: 8800ba8b8380 RDI: 8800ba8b8388 RBP: 8800baa3fdd0 R08: 00015d60 R09: 003f9840b000 R10: 000b R11: 00016338 R12: R13: 8800bc609450 R14: 8802212db980 R15: 8800ba8b8380 FS: 7f0b70511740() GS:88022bc8() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: 003f9840b000 CR3: bab8f000 CR4: 07e0 Stack: 880221ba92a0 88020dcfc980 8800bc609450 880221ba9228 8800bc609450 880221ba9228 8800baa3fe08 81169b7b 88020dcfc980 880211db5f80 880221ba9228 8800bc609450 Call Trace: [81169b7b] anon_vma_fork+0x26/0xec [810c0070] copy_process.part.28+0xf98/0x16ca [810c091d] do_fork+0xb4/0x211 [810cbe9b] ? __set_task_blocked+0x5e/0x64 [815a1b9f] ? _raw_spin_unlock_irq+0xc/0x1f [810c0ae0] SyS_clone+0x11/0x13 [815a6a59] stub_clone+0x69/0x90 [815a66fd] ? system_call_fastpath+0x1a/0x1f Code: 48 8b 3d 64 f1 b0 00 be d0 00 00 00 e8 9b 5e 01 00 48 85 c0 48 89 c3 74 62 45 31 e4 4d 8b 7e 08 4c 89 e7 49 8b 37 e8 fa ec ff ff RIP [81169ae8] anon_vma_clone+0x88/0xf5 RSP 8800baa3fda0 CR2: 003f9840b000 -- Ben Greear gree...@candelatech.com Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Crashes in 3.14.13+ related to anon_vma_clone.
On 07/24/2014 04:08 PM, Ben Greear wrote: A few of our systems are repeatedly crashing when upgraded from the 3.14.6+ to 3.14.13+ kernels. Both kernels have a fair bit of our out-of-tree patches, so could be our fault. Ahh, looks like a bad merge of a stable patch with a slightly different manually applied stable patch, causing double-free in the wifi stack... Sorry for the noise. Thanks, Ben But, in case this is a known problem...has anyone seen crashes like this below? Test case is using some ath9k APs and stations, but not certain that has much to do with the problem. BUG: unable to handle kernel paging request at 003f9840b000 IP: [81169ae8] anon_vma_clone+0x88/0xf5 PGD be580067 PUD bab85067 PMD babd5067 PTE 8000bebbd065 Oops: 0003 [#1] PREEMPT SMP Modules linked in: 8021q garp stp mrp llc fuse macvlan wanlink(O) pktgen ip6table_filter ip6_tables ebtable_nat ebtables coretemp hwmoo CPU: 1 PID: 1777 Comm: btserver Tainted: G C O 3.14.13+ #19 Hardware name: To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M., BIOS 4.6.3 03/06/2012 task: 88020dcfb100 ti: 8800baa3e000 task.ti: 8800baa3e000 RIP: 0010:[81169ae8] [81169ae8] anon_vma_clone+0x88/0xf5 RSP: 0018:8800baa3fda0 EFLAGS: 00010246 RAX: 8800ba8b8380 RBX: 003f9840b000 RCX: 0142b101 RDX: RSI: 8800ba8b8380 RDI: 8800ba8b8388 RBP: 8800baa3fdd0 R08: 00015d60 R09: 003f9840b000 R10: 000b R11: 00016338 R12: R13: 8800bc609450 R14: 8802212db980 R15: 8800ba8b8380 FS: 7f0b70511740() GS:88022bc8() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: 003f9840b000 CR3: bab8f000 CR4: 07e0 Stack: 880221ba92a0 88020dcfc980 8800bc609450 880221ba9228 8800bc609450 880221ba9228 8800baa3fe08 81169b7b 88020dcfc980 880211db5f80 880221ba9228 8800bc609450 Call Trace: [81169b7b] anon_vma_fork+0x26/0xec [810c0070] copy_process.part.28+0xf98/0x16ca [810c091d] do_fork+0xb4/0x211 [810cbe9b] ? __set_task_blocked+0x5e/0x64 [815a1b9f] ? _raw_spin_unlock_irq+0xc/0x1f [810c0ae0] SyS_clone+0x11/0x13 [815a6a59] stub_clone+0x69/0x90 [815a66fd] ? system_call_fastpath+0x1a/0x1f Code: 48 8b 3d 64 f1 b0 00 be d0 00 00 00 e8 9b 5e 01 00 48 85 c0 48 89 c3 74 62 45 31 e4 4d 8b 7e 08 4c 89 e7 49 8b 37 e8 fa ec ff ff RIP [81169ae8] anon_vma_clone+0x88/0xf5 RSP 8800baa3fda0 CR2: 003f9840b000 -- Ben Greear gree...@candelatech.com Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ath10k: fix rssi reporting.
Sorry, ...this and previous patch should not have gone to LKML. Will send it over to ath10k list where it was supposed to go. Thanks, Ben On 05/15/2014 11:31 AM, gree...@candelatech.com wrote: > From: Ben Greear > > When the driver cannot provide proper rssi, mark > status with RX_FLAG_NO_SIGNAL_VAL so that stack > properly ignores it. > > Signed-off-by: Ben Greear > --- > > Patch is against my tree, hopefully will apply OK against upstream > but have not tested that yet. > > drivers/net/wireless/ath/ath10k/htt_rx.c | 7 --- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c > b/drivers/net/wireless/ath/ath10k/htt_rx.c > index 0c83ffb..d8ec8dd 100644 > --- a/drivers/net/wireless/ath/ath10k/htt_rx.c > +++ b/drivers/net/wireless/ath/ath10k/htt_rx.c > @@ -1216,11 +1216,12 @@ static void ath10k_htt_rx_handler(struct ath10k_htt > *htt, > mpdu_ranges = htt_rx_ind_get_mpdu_ranges(rx); > > /* Fill this once, while this is per-ppdu */ > - if (rx->ppdu.info0 & HTT_RX_INDICATION_INFO0_START_VALID) { > - memset(rx_status, 0, sizeof(*rx_status)); > + memset(rx_status, 0, sizeof(*rx_status)); > + if (rx->ppdu.info0 & HTT_RX_INDICATION_INFO0_START_VALID) > rx_status->signal = ATH10K_DEFAULT_NOISE_FLOOR + >rx->ppdu.combined_rssi; > - } > + else > + rx_status->flag |= RX_FLAG_NO_SIGNAL_VAL; > > if (rx->ppdu.info0 & HTT_RX_INDICATION_INFO0_END_VALID) { > /* TSF available only in 32-bit */ > -- Ben Greear Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ath10k: fix rssi reporting.
Sorry, ...this and previous patch should not have gone to LKML. Will send it over to ath10k list where it was supposed to go. Thanks, Ben On 05/15/2014 11:31 AM, gree...@candelatech.com wrote: From: Ben Greear gree...@candelatech.com When the driver cannot provide proper rssi, mark status with RX_FLAG_NO_SIGNAL_VAL so that stack properly ignores it. Signed-off-by: Ben Greear gree...@candelatech.com --- Patch is against my tree, hopefully will apply OK against upstream but have not tested that yet. drivers/net/wireless/ath/ath10k/htt_rx.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c b/drivers/net/wireless/ath/ath10k/htt_rx.c index 0c83ffb..d8ec8dd 100644 --- a/drivers/net/wireless/ath/ath10k/htt_rx.c +++ b/drivers/net/wireless/ath/ath10k/htt_rx.c @@ -1216,11 +1216,12 @@ static void ath10k_htt_rx_handler(struct ath10k_htt *htt, mpdu_ranges = htt_rx_ind_get_mpdu_ranges(rx); /* Fill this once, while this is per-ppdu */ - if (rx-ppdu.info0 HTT_RX_INDICATION_INFO0_START_VALID) { - memset(rx_status, 0, sizeof(*rx_status)); + memset(rx_status, 0, sizeof(*rx_status)); + if (rx-ppdu.info0 HTT_RX_INDICATION_INFO0_START_VALID) rx_status-signal = ATH10K_DEFAULT_NOISE_FLOOR + rx-ppdu.combined_rssi; - } + else + rx_status-flag |= RX_FLAG_NO_SIGNAL_VAL; if (rx-ppdu.info0 HTT_RX_INDICATION_INFO0_END_VALID) { /* TSF available only in 32-bit */ -- Ben Greear gree...@candelatech.com Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] netdev: pktgen xmit packet through vlan interface
On 05/05/2014 08:12 AM, Jesper Dangaard Brouer wrote: > On Fri, 02 May 2014 07:00:00 -0700 > John Fastabend wrote: > >> On 5/2/2014 6:19 AM, Jesper Dangaard Brouer wrote: >>> On Fri, 2 May 2014 15:18:12 +0800 >>> Zhouyi Zhou wrote: >>> >>>> As http://www.spinics.net/lists/netdev/msg165015.html >>>> pktgen generates shared packet through vlan interface will cause >>>> oops because of duplicate entering tc queue. >>>> >>>> Try to solve this problem by means of packet clone instead of sharing. >>> >>> I really don't like adding this stuff to the fast path of pktgen. >>> >>> Why would you use pktgen on a VLAN? >> >> Its a good way to test qdiscs. When you run pktgen over the VLAN >> you exercise the lower devices qdisc. > > I do (personally) need a faster way/tool to exercise the qdisc path. > I'm currently using trafgen, but it is not fast enough for my 10G > testing. > > Perhaps we could add a pktgen option, that explicitly enable > transmitting on qdisc path. And when adding a VLAN device, auto enable > that mode? You could just force pktgen to not support multi-skb on vlan interfaces? I thought we went through this a year or two ago and came up with something like a 'pktgen-challenged' network interface flag? Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] netdev: pktgen xmit packet through vlan interface
On 05/05/2014 08:12 AM, Jesper Dangaard Brouer wrote: On Fri, 02 May 2014 07:00:00 -0700 John Fastabend john.r.fastab...@intel.com wrote: On 5/2/2014 6:19 AM, Jesper Dangaard Brouer wrote: On Fri, 2 May 2014 15:18:12 +0800 Zhouyi Zhou zhouzho...@gmail.com wrote: As http://www.spinics.net/lists/netdev/msg165015.html pktgen generates shared packet through vlan interface will cause oops because of duplicate entering tc queue. Try to solve this problem by means of packet clone instead of sharing. I really don't like adding this stuff to the fast path of pktgen. Why would you use pktgen on a VLAN? Its a good way to test qdiscs. When you run pktgen over the VLAN you exercise the lower devices qdisc. I do (personally) need a faster way/tool to exercise the qdisc path. I'm currently using trafgen, but it is not fast enough for my 10G testing. Perhaps we could add a pktgen option, that explicitly enable transmitting on qdisc path. And when adding a VLAN device, auto enable that mode? You could just force pktgen to not support multi-skb on vlan interfaces? I thought we went through this a year or two ago and came up with something like a 'pktgen-challenged' network interface flag? Thanks, Ben -- Ben Greear gree...@candelatech.com Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Build failure related to 'spinlokk_types.h'
On 02/26/2014 11:50 AM, Ben Greear wrote: > This is from the ath10k tree, which was recently rebased on top of > 3.14.0-rc4. > > I'm getting the error below, but I cannot find any reference to 'spinlokk' > in the source tree. The build tree has two mentions, but these are > auto-generated > from what I can tell: > > [greearb@ben-dt2 linux-ath.x64]$ grep -r spinlokk . > ./arch/x86/kernel/cpu/.common.o.tmp: > /mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h \ > ./arch/x86/kernel/cpu/.common.o.d: > /mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h \ > [greearb@ben-dt2 linux-ath.x64]$ Looks like it was a .ccache issue...I removed it and re-built and it was fine... Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Build failure related to 'spinlokk_types.h'
This is from the ath10k tree, which was recently rebased on top of 3.14.0-rc4. I'm getting the error below, but I cannot find any reference to 'spinlokk' in the source tree. The build tree has two mentions, but these are auto-generated from what I can tell: [greearb@ben-dt2 linux-ath.x64]$ grep -r spinlokk . ./arch/x86/kernel/cpu/.common.o.tmp: /mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h \ ./arch/x86/kernel/cpu/.common.o.d: /mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h \ [greearb@ben-dt2 linux-ath.x64]$ I tried make distclean, but that did not help. Any idea where this might be coming from? [greearb@ben-dt2 linux-ath.x64]$ make bzImage modules make[3]: Nothing to be done for `all'. make[3]: Nothing to be done for `relocs'. CHK include/config/kernel.release Using /mnt/sda/home/greearb/git/linux.ath as source for kernel GEN /home/greearb/kernel/2.6/linux-ath.x64/Makefile CHK include/generated/uapi/linux/version.h CHK include/generated/utsrelease.h CALL/mnt/sda/home/greearb/git/linux.ath/scripts/checksyscalls.sh CHK include/generated/compile.h CC arch/x86/kernel/cpu/common.o fixdep: error opening config file: /mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h: No such file or directory make[5]: *** [arch/x86/kernel/cpu/common.o] Error 2 make[4]: *** [arch/x86/kernel/cpu] Error 2 make[3]: *** [arch/x86/kernel] Error 2 make[2]: *** [arch/x86] Error 2 make[1]: *** [sub-make] Error 2 make: *** [all] Error 2 Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Build failure related to 'spinlokk_types.h'
This is from the ath10k tree, which was recently rebased on top of 3.14.0-rc4. I'm getting the error below, but I cannot find any reference to 'spinlokk' in the source tree. The build tree has two mentions, but these are auto-generated from what I can tell: [greearb@ben-dt2 linux-ath.x64]$ grep -r spinlokk . ./arch/x86/kernel/cpu/.common.o.tmp: /mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h \ ./arch/x86/kernel/cpu/.common.o.d: /mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h \ [greearb@ben-dt2 linux-ath.x64]$ I tried make distclean, but that did not help. Any idea where this might be coming from? [greearb@ben-dt2 linux-ath.x64]$ make bzImage modules make[3]: Nothing to be done for `all'. make[3]: Nothing to be done for `relocs'. CHK include/config/kernel.release Using /mnt/sda/home/greearb/git/linux.ath as source for kernel GEN /home/greearb/kernel/2.6/linux-ath.x64/Makefile CHK include/generated/uapi/linux/version.h CHK include/generated/utsrelease.h CALL/mnt/sda/home/greearb/git/linux.ath/scripts/checksyscalls.sh CHK include/generated/compile.h CC arch/x86/kernel/cpu/common.o fixdep: error opening config file: /mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h: No such file or directory make[5]: *** [arch/x86/kernel/cpu/common.o] Error 2 make[4]: *** [arch/x86/kernel/cpu] Error 2 make[3]: *** [arch/x86/kernel] Error 2 make[2]: *** [arch/x86] Error 2 make[1]: *** [sub-make] Error 2 make: *** [all] Error 2 Thanks, Ben -- Ben Greear gree...@candelatech.com Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Build failure related to 'spinlokk_types.h'
On 02/26/2014 11:50 AM, Ben Greear wrote: This is from the ath10k tree, which was recently rebased on top of 3.14.0-rc4. I'm getting the error below, but I cannot find any reference to 'spinlokk' in the source tree. The build tree has two mentions, but these are auto-generated from what I can tell: [greearb@ben-dt2 linux-ath.x64]$ grep -r spinlokk . ./arch/x86/kernel/cpu/.common.o.tmp: /mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h \ ./arch/x86/kernel/cpu/.common.o.d: /mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h \ [greearb@ben-dt2 linux-ath.x64]$ Looks like it was a .ccache issue...I removed it and re-built and it was fine... Thanks, Ben -- Ben Greear gree...@candelatech.com Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 4/11] use ether_addr_equal_64bits
On 12/31/2013 08:09 AM, Julia Lawall wrote: On Tue, 31 Dec 2013, Ben Greear wrote: On 12/30/2013 10:32 PM, Julia Lawall wrote: I'm just thinking of a programmer, e.g. changing a struct like this: struct foo { u8 addr[ETH_ALEN]; - u16 dummy; }; I don't know of a way to catch that. Anyone else? Well, one could have a semantic patch that checks for that. But the problem is that it is very slow, and it only covers the cases that I can transform automatically, which currently means no pointers, only explicit arrays. On the other hand, I am finding the structure definition, so I can easily update the structure definition with an appropriate comment. struct foo { u8 addr[ETH_ALEN]; /* must be followed by two bytes in the structure */ u16 dummy; }; Unfortunately it is kind of verbose. Could there be an attribute? That could even easily be checked. Can you not just add a build-time macro to check that sizeof(foo) >= 8 for each of these struct foos? Or, is it required that the dummy field be there and be not used by anything else? It doesn't matter what the field is used for. The problem is that is it necessary to ensure a property of the position of addr within the structure. It has to have at least 16 bytes after it. You mean 16 bits? But maybe something with sizeof(foo) and offset_of would do? Could the macro be put near the declaration of the structure somehow? I think that would work, but do not know all of the details of such macros, so it's possible there is some catch. If nothing else, then some run-time code that calculates the offset off and asserts if it is broken in module initialization or similar might be good enough. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/