from:"Ben Greear"

Re: [RFC 2/7] ath10k: Add support to process rx packet in thread

2021-03-22 Thread Ben Greear


On 3/22/21 6:20 PM, Brian Norris wrote:

On Mon, Mar 22, 2021 at 4:58 PM Ben Greear  wrote:

On 7/22/20 6:00 AM, Felix Fietkau wrote:

On 2020-07-22 14:55, Johannes Berg wrote:

On Wed, 2020-07-22 at 14:27 +0200, Felix Fietkau wrote:


I'm considering testing a different approach (with mt76 initially):
- Add a mac80211 rx function that puts processed skbs into a list
instead of handing them to the network stack directly.


Would this be *after* all the mac80211 processing, i.e. in place of the
rx-up-to-stack?

Yes, it would run all the rx handlers normally and then put the
resulting skbs into a list instead of calling netif_receive_skb or
napi_gro_frags.


Whatever came of this?  I realized I'm running Felix's patch since his mt76
driver needs it.  Any chance it will go upstream?


If you're asking about $subject (moving NAPI/RX to a thread), this
landed upstream recently:
http://git.kernel.org/linus/adbb4fb028452b1b0488a1a7b66ab856cdf20715

It needs a bit of coaxing to work on a WiFi driver (including: WiFi
drivers tend to have a different netdev for NAPI than they expose to
/sys/class/net/), but it's there.

I'm not sure if people had something else in mind in the stuff you're
quoting though.


No, I got it confused with something Felix did:

https://github.com/greearb/mt76/blob/master/patches/0001-net-add-support-for-threaded-NAPI-polling.patch

Maybe the NAPI/RX to a thread thing superceded Felix's patch?

Thanks,
Ben



Brian




--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [RFC 2/7] ath10k: Add support to process rx packet in thread

2021-03-22 Thread Ben Greear


On 7/22/20 6:00 AM, Felix Fietkau wrote:

On 2020-07-22 14:55, Johannes Berg wrote:

On Wed, 2020-07-22 at 14:27 +0200, Felix Fietkau wrote:


I'm considering testing a different approach (with mt76 initially):
- Add a mac80211 rx function that puts processed skbs into a list
instead of handing them to the network stack directly.


Would this be *after* all the mac80211 processing, i.e. in place of the
rx-up-to-stack?

Yes, it would run all the rx handlers normally and then put the
resulting skbs into a list instead of calling netif_receive_skb or
napi_gro_frags.


Whatever came of this?  I realized I'm running Felix's patch since his mt76
driver needs it.  Any chance it will go upstream?

Thanks,
Ben



- Felix




--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 0/3] mac80211: Trigger disconnect for STA during recovery

2020-12-17 Thread Ben Greear


On 12/17/20 2:24 PM, Brian Norris wrote:

On Tue, Dec 15, 2020 at 10:23:33AM -0800, Ben Greear wrote:

On 12/15/20 9:21 AM, Youghandhar Chintala wrote:

From: Rakesh Pillai 

Currently in case of target hardware restart ,we just reconfig and
re-enable the security keys and enable the network queues to start
data traffic back from where it was interrupted.


Are there any known mac80211 radios/drivers that *can* support seamless 
restarts?

If not, then just could always enable this feature in mac80211?


I'm quite sure that iwlwifi intentionally supports a seamless restart.
 From my experience with dealing with user reports, I don't recall any
issues where restart didn't function as expected, unless there was some
deeper underlying failure (e.g., hardware/power failure; driver bugs /
lockups).

I don't have very good stats for ath10k/QCA6174, but it survives
our testing OK and I again don't recall any user-reported complaints in
this area. I'd say this is a weaker example though, as I don't have as
clear of data. (By contrast, ath10k/WCN399x, which Rakesh, et al, are
patching here, does not pass our tests at all, and clearly fails to
recover from "seamless" restarts, as noted in patch 3.)

I'd also note that we don't operate in AP mode -- only STA -- and IIRC
Ben, you've complained about AP mode in the past.


I complain about all sorts of things, but I'm usually running
station mode :)

Do you actually see iwlwifi stations stay associated through
firmware crashes?

Anyway, happy to hear some have seamless recovery, and in that case,
I have no objections to the patch.

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 0/3] mac80211: Trigger disconnect for STA during recovery

2020-12-15 Thread Ben Greear


On 12/15/20 9:21 AM, Youghandhar Chintala wrote:

From: Rakesh Pillai 

Currently in case of target hardware restart ,we just reconfig and
re-enable the security keys and enable the network queues to start
data traffic back from where it was interrupted.


Are there any known mac80211 radios/drivers that *can* support seamless 
restarts?

If not, then just could always enable this feature in mac80211?

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

BUG: Invalid wait context, related to serial console?

2020-09-22 Thread Ben Greear


I saw this booting yesterday's 5.9.0-rc6+.  I have one small patch applied, 
quite unlikely
to cause any problems I think.  Platform is apu2, fedora-27 linux, with serial 
console attached.


[0.624831] =
[0.624831] [ BUG: Invalid wait context ]
[0.624831] 5.9.0-rc6+ #1 Not tainted
[0.624831] -
[0.624831] swapper/1/0 is trying to lock:
[0.624831] 85523218 (_lock_key){}-{3:3}, at: 
serial8250_console_write+0xfd/0x520
[0.624831] other info that might help us debug this:
[0.624831] context-{2:2}
[0.624831] 2 locks held by swapper/1/0:
[0.624831]  #0: 83338a60 (console_lock){+.+.}-{0:0}, at: 
printk+0xad/0xde
[0.624831]  #1: 83338820 (console_owner){-...}-{0:0}, at: 
console_unlock+0x291/0x840
[0.624831] stack backtrace:
[0.624831] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.9.0-rc6+ #1
[0.624831] Hardware name: PC Engines APU2/APU2, BIOS 4.0.7 02/28/2017
[0.624831] Call Trace:
[0.624831]  dump_stack+0xae/0xe8
[0.624831]  __lock_acquire.cold+0x1b9/0x34d
[0.624831]  ? save_trace+0x345/0x440
[0.624831]  ? lockdep_hardirqs_on_prepare+0x260/0x260
[0.624831]  ? mark_lock+0x90/0xb00
[0.624831]  lock_acquire+0x14f/0x5f0
[0.624831]  ? serial8250_console_write+0xfd/0x520
[0.624831]  ? lock_release+0x440/0x440
[0.624831]  ? lockdep_hardirqs_on_prepare+0x260/0x260
[0.624831]  ? __lock_acquire+0x85c/0x2f10
[0.624831]  _raw_spin_lock_irqsave+0x43/0x60
[0.624831]  ? serial8250_console_write+0xfd/0x520
[0.624831]  serial8250_console_write+0xfd/0x520
[0.624831]  ? serial8250_config_port+0x1630/0x1630
[0.624831]  ? lock_downgrade+0x3a0/0x3a0
[0.624831]  ? do_raw_spin_lock+0x114/0x1a0
[0.624831]  ? rwlock_bug.part.0+0x50/0x50
[0.624831]  console_unlock+0x582/0x840
[0.624831]  ? devkmsg_open+0x170/0x170
[0.624831]  ? do_raw_spin_unlock+0x8e/0xe0
[0.624831]  ? printk+0xad/0xde
[0.624831]  vprintk_emit+0x165/0x3e0
[0.624831]  printk+0xad/0xde
[0.624831]  ? kmsg_dump_rewind_nolock+0x54/0x54
[0.624831]  ? lock_acquire+0x14f/0x5f0
[0.624831]  ? do_raw_spin_unlock+0x8e/0xe0
[0.624831]  ? do_raw_spin_unlock+0x7f/0xe0
[0.624831]  common_interrupt+0x16c/0x180
[0.624831]  asm_common_interrupt+0x1e/0x40
[0.624831] RIP: 0010:start_secondary+0x1b6/0x230
[0.624831] Code: 65 8b 3d 1d 97 f7 7e e8 48 f0 14 00 48 c7 c7 d0 d4 04 83 e8 4c 15 45 00 ff 15 76 d8 fa 01 e8 31 6f 2b 00 fb 66 0f 1f 44 00 00 <48> c7 c7 40 
33 29 83 e8 2e 15 45 00 ff 15 c8 36 1f 02 0f ae f8 bf

[0.624831] RSP: :8880cda3fed8 EFLAGS: 0202
[0.624831] RAX: 0001 RBX: 0001 RCX: dc00
[0.624831] RDX:  RSI:  RDI: 8109fc5f
[0.624831] RBP: 111019b47fdb R08: 0001 R09: 0001
[0.624831] R10: ed101a494179 R11: 0001 R12: 000f3aac
[0.624831] R13:  R14:  R15: 
[0.624831]  ? start_secondary+0x1af/0x230
[0.624831]  ? start_secondary+0x1af/0x230
[0.624831]  ? set_cpu_sibling_map+0xc70/0xc70
[0.624831]  secondary_startup_64+0xa4/0xb0
[2.531065]  #2
[0.624831] __common_interrupt: 2.55 No irq handler for vector
[2.542389]  #3
[0.624831] __common_interrupt: 3.55 No irq handler for vector


Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: Build failure for today's tree, Fedora-32

2020-09-21 Thread Ben Greear


On 9/21/20 12:50 PM, Ben Greear wrote:

Hello,

I'm seeing this build failure, any idea what is the issue?  I've cleared my 
ccache (ccache --clear)
but that did not help.  A pull from this morning builds on Fedora-29 with same 
.config file.  I tried
that same commit on my F32 system to no success, so it seems like it might be 
something with
my F32 compiler/system.


Looks like my git tree was corrupted.  A fresh one is working better.

Thanks,
Ben

Build failure for today's tree, Fedora-32

2020-09-21 Thread Ben Greear

5:5: error: conflicting 
types for ‘copy_siginfo_to_user’
   35 | int copy_siginfo_to_user(siginfo_t __user *to, const kernel_siginfo_t 
*from);
  | ^~~~
In file included from 
/home/greearb/git/linux-linus/arch/x86/include/uapi/asm/siginfo.h:13,
 from 
/home/greearb/git/linux-linus/include/uapi/linux/signal.h:6,
 from 
/home/greearb/git/linux-linus/include/linux/signal_types.h:10,
 from /home/greearb/git/linux-linus/include/linux/sched.h:30,
 from /home/greearb/git/linux-linus/include/linux/uaccess.h:6,
 from 
/home/greearb/git/linux-linus/arch/x86/include/asm/fpu/xstate.h:5,
 from 
/home/greearb/git/linux-linus/arch/x86/include/asm/pgtable.h:26,
 from /home/greearb/git/linux-linus/include/linux/pgtable.h:6,
 from /home/greearb/git/linux-linus/include/linux/kasan.h:14,
 from /home/greearb/git/linux-linus/include/linux/slab.h:136,
 from /home/greearb/git/linux-linus/include/linux/crypto.h:20,
 from 
/home/greearb/git/linux-linus/arch/x86/kernel/asm-offsets.c:9:
/home/greearb/git/linux-linus/include/asm-generic/siginfo.h:35:12: note: 
previous declaration of ‘copy_siginfo_to_user’ was here
   35 | extern int copy_siginfo_to_user(struct siginfo __user *to, const struct 
siginfo *from);
  |^~~~
make[2]: *** [/home/greearb/git/linux-linus/scripts/Makefile.build:117: 
arch/x86/kernel/asm-offsets.s] Error 1
make[1]: *** [/home/greearb/git/linux-linus/Makefile:1198: prepare0] Error 2
make: *** [/home/greearb/git/linux-linus/Makefile:185: __sub-make] Error 2


Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH v2 1/3] ath10k: Add history for tracking certain events

2020-07-31 Thread Ben Greear


On 7/31/20 11:27 AM, Rakesh Pillai wrote:

Add history for tracking the below events
- register read
- register write
- IRQ trigger
- NAPI poll
- CE service
- WMI cmd
- WMI event
- WMI tx completion

This will help in debugging any crash or any
improper behaviour.

Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.1-01040-QCAHLSWMTPLZ-1

Signed-off-by: Rakesh Pillai 
---
  drivers/net/wireless/ath/ath10k/ce.c  |   1 +
  drivers/net/wireless/ath/ath10k/core.h|  74 +
  drivers/net/wireless/ath/ath10k/debug.c   | 133 ++
  drivers/net/wireless/ath/ath10k/debug.h   |  74 +
  drivers/net/wireless/ath/ath10k/snoc.c|  15 +++-
  drivers/net/wireless/ath/ath10k/wmi-tlv.c |   1 +
  drivers/net/wireless/ath/ath10k/wmi.c |  10 +++
  7 files changed, 307 insertions(+), 1 deletion(-)




+void ath10k_record_wmi_event(struct ath10k *ar, enum ath10k_wmi_type type,
+u32 id, unsigned char *data)
+{
+   struct ath10k_wmi_event_entry *entry;
+   u32 idx;
+
+   if (type == ATH10K_WMI_EVENT) {
+   if (!ar->wmi_event_history.record)
+   return;


This check above is duplicated below, add it once at top of the method
instead.


+
+   spin_lock_bh(>wmi_event_history.hist_lock);
+   idx = ath10k_core_get_next_idx(>reg_access_history.index,
+  
ar->wmi_event_history.max_entries);
+   spin_unlock_bh(>wmi_event_history.hist_lock);
+   entry = >wmi_event_history.record[idx];
+   } else {
+   if (!ar->wmi_cmd_history.record)
+   return;
+
+   spin_lock_bh(>wmi_cmd_history.hist_lock);
+   idx = ath10k_core_get_next_idx(>reg_access_history.index,
+  ar->wmi_cmd_history.max_entries);
+   spin_unlock_bh(>wmi_cmd_history.hist_lock);
+   entry = >wmi_cmd_history.record[idx];
+   }
+
+   entry->timestamp = ath10k_core_get_timestamp();
+   entry->cpu_id = smp_processor_id();
+   entry->type = type;
+   entry->id = id;
+   memcpy(>data, data + 4, ATH10K_WMI_DATA_LEN);
+}
+EXPORT_SYMBOL(ath10k_record_wmi_event);



@@ -1660,6 +1668,11 @@ static int ath10k_snoc_probe(struct platform_device 
*pdev)
ar->ce_priv = _snoc->ce;
msa_size = drv_data->msa_size;
  
+	ath10k_core_reg_access_history_init(ar, ATH10K_REG_ACCESS_HISTORY_MAX);

+   ath10k_core_wmi_event_history_init(ar, ATH10K_WMI_EVENT_HISTORY_MAX);
+   ath10k_core_wmi_cmd_history_init(ar, ATH10K_WMI_CMD_HISTORY_MAX);
+   ath10k_core_ce_event_history_init(ar, ATH10K_CE_EVENT_HISTORY_MAX);


Maybe only enable this once user turns it on?  It sucks up a bit of memory?


+
ath10k_snoc_quirks_init(ar);
  
  	ret = ath10k_snoc_resource_init(ar);

diff --git a/drivers/net/wireless/ath/ath10k/wmi-tlv.c 
b/drivers/net/wireless/ath/ath10k/wmi-tlv.c
index 932266d..9df5748 100644
--- a/drivers/net/wireless/ath/ath10k/wmi-tlv.c
+++ b/drivers/net/wireless/ath/ath10k/wmi-tlv.c
@@ -627,6 +627,7 @@ static void ath10k_wmi_tlv_op_rx(struct ath10k *ar, struct 
sk_buff *skb)
if (skb_pull(skb, sizeof(struct wmi_cmd_hdr)) == NULL)
goto out;
  
+	ath10k_record_wmi_event(ar, ATH10K_WMI_EVENT, id, skb->data);

trace_ath10k_wmi_event(ar, id, skb->data, skb->len);
  
  	consumed = ath10k_tm_event_wmi(ar, id, skb);

diff --git a/drivers/net/wireless/ath/ath10k/wmi.c 
b/drivers/net/wireless/ath/ath10k/wmi.c
index a81a1ab..8ebd05c 100644
--- a/drivers/net/wireless/ath/ath10k/wmi.c
+++ b/drivers/net/wireless/ath/ath10k/wmi.c
@@ -1802,6 +1802,15 @@ struct sk_buff *ath10k_wmi_alloc_skb(struct ath10k *ar, 
u32 len)
  
  static void ath10k_wmi_htc_tx_complete(struct ath10k *ar, struct sk_buff *skb)

  {
+   struct wmi_cmd_hdr *cmd_hdr;
+   enum wmi_tlv_event_id id;
+
+   cmd_hdr = (struct wmi_cmd_hdr *)skb->data;
+   id = MS(__le32_to_cpu(cmd_hdr->cmd_id), WMI_CMD_HDR_CMD_ID);
+
+   ath10k_record_wmi_event(ar, ATH10K_WMI_TX_COMPL, id,
+   skb->data + sizeof(struct wmi_cmd_hdr));
+
dev_kfree_skb(skb);
  }


I think guard the above new code with if 
(unlikely(ar->ce_event_history.record)) { ... }

All in all, I think I'd want to compile this out (while leaving other debug 
compiled
in) since it seems this stuff would be rarely used and it adds method calls to 
hot
paths.

That is a decision for Kalle though, so see what he says...

Thanks,
Ben


--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH] ath10k: Add history for tracking certain events

2020-06-28 Thread Ben Greear





On 06/27/2020 10:12 PM, Rakesh Pillai wrote:




-Original Message-
From: Ben Greear 
Sent: Saturday, June 27, 2020 8:58 PM
To: Rakesh Pillai ; ath...@lists.infradead.org
Cc: linux-wirel...@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH] ath10k: Add history for tracking certain events



On 06/26/2020 11:22 PM, Rakesh Pillai wrote:

For debugging many issues, a history of the
below mentioned events can help get an idea
of what exactly was going on just before any
issue occurred in the system. These event
history will be collected only when the host
driver is run in debug mode (i.e. with the
config ATH10K_DEBUG enabled).


This should be disabled by default unless user specifically pokes some
debugfs
value to turn it on so that it does not impact performance.


Hi Ben,
This history is enabled only if the user compiles the kernel with
ATH10K_DEBUG.
Making it runtime, adds a lot of "if" conditions for this history record.
Do you suggest to add support to enable/disable it runtime even in
ATH10K_DEBUG ?


Yes, because you are adding lots of locks/unlocks.  That is way more expensive
than an if statement.  You can add an 'unlikely' to the if check as well, so
compiler will optimize for this feature not being enabled.

Thanks,
Ben





Thanks,
Ben



Add history for tracking the below events
- register read
- register write
- IRQ trigger
- IRQ Enable
- IRQ Disable
- NAPI poll
- CE service
- WMI cmd
- WMI event
- WMI tx completion

This will help in debugging any crash or any
improper behaviour.



--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com




--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH] ath10k: Add history for tracking certain events

2020-06-27 Thread Ben Greear





On 06/26/2020 11:22 PM, Rakesh Pillai wrote:

For debugging many issues, a history of the
below mentioned events can help get an idea
of what exactly was going on just before any
issue occurred in the system. These event
history will be collected only when the host
driver is run in debug mode (i.e. with the
config ATH10K_DEBUG enabled).


This should be disabled by default unless user specifically pokes some debugfs
value to turn it on so that it does not impact performance.

Thanks,
Ben



Add history for tracking the below events
- register read
- register write
- IRQ trigger
- IRQ Enable
- IRQ Disable
- NAPI poll
- CE service
- WMI cmd
- WMI event
- WMI tx completion

This will help in debugging any crash or any
improper behaviour.



--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 2/2] ath10k: Skip wait for delete response if firmware is down

2020-06-26 Thread Ben Greear





On 06/26/2020 11:11 AM, Rakesh Pillai wrote:

Currently the driver waits for response from the
firmware for all the delete cmds, eg: vdev_delete,
peer delete. If the firmware is down, these wait
will always timeout and return an error.

Also during subsytems recovery, any attempt to
send a WMI cmd to the FW will return the -ESHUTDOWN
status, which when returned to mac80211, can cause
unnecessary warnings to be printed on to the console,
as shown below

[ 2559.529565] Call trace:
[ 2559.532214]  __sta_info_destroy_part2+0x160/0x168 [mac80211]
[ 2559.538157]  __sta_info_flush+0x124/0x180 [mac80211]
[ 2559.543402]  ieee80211_set_disassoc+0x130/0x2c0 [mac80211]
[ 2559.549172]  ieee80211_mgd_deauth+0x238/0x25c [mac80211]
[ 2559.554764]  ieee80211_deauth+0x24/0x30 [mac80211]
[ 2559.559860]  cfg80211_mlme_deauth+0x258/0x2b0 [cfg80211]
[ 2559.565446]  nl80211_deauthenticate+0xe4/0x110 [cfg80211]
[ 2559.571064]  genl_rcv_msg+0x3a0/0x440
[ 2559.574888]  netlink_rcv_skb+0xb4/0x11c
[ 2559.578877]  genl_rcv+0x34/0x48
[ 2559.582162]  netlink_unicast+0x14c/0x1e4
[ 2559.586235]  netlink_sendmsg+0x2f0/0x360
[ 2559.590317]  sock_sendmsg+0x44/0x5c
[ 2559.593951]  sys_sendmsg+0x1c8/0x290
[ 2559.598029]  ___sys_sendmsg+0xa8/0xfc
[ 2559.601840]  __sys_sendmsg+0x8c/0xd0
[ 2559.605572]  __arm64_compat_sys_sendmsg+0x2c/0x38
[ 2559.610468]  el0_svc_common+0xa8/0x160
[ 2559.614372]  el0_svc_compat_handler+0x2c/0x38
[ 2559.618905]  el0_svc_compat+0x8/0x10

Skip the wait for delete response from the
firmware if the firmware is down. Also return
success to the mac80211 calls when the peer delete
cmd fails with return status -ESHUTDOWN.

Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.1-01040-QCAHLSWMTPLZ-1

Signed-off-by: Rakesh Pillai 
---
 drivers/net/wireless/ath/ath10k/mac.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c 
b/drivers/net/wireless/ath/ath10k/mac.c
index dc7befc..7ac6549 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -701,7 +701,8 @@ static void ath10k_wait_for_peer_delete_done(struct ath10k 
*ar, u32 vdev_id,
unsigned long time_left;
int ret;

-   if (test_bit(WMI_SERVICE_SYNC_DELETE_CMDS, ar->wmi.svc_map)) {
+   if (test_bit(WMI_SERVICE_SYNC_DELETE_CMDS, ar->wmi.svc_map) &&
+   test_bit(ATH10K_FLAG_CRASH_FLUSH, >dev_flags)) {


Don't you mean !test_bit(ATH10K_FLAG_CRASH_FLUSH, >dev_flags))  ???

Or maybe I'm just mis-reading your patch?

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH v3 0/8] kernel: taint when the driver firmware crashes

2020-05-28 Thread Ben Greear





On 05/28/2020 07:27 AM, Luis Chamberlain wrote:

On Wed, May 27, 2020 at 02:36:42PM -0700, Jakub Kicinski wrote:

On Wed, 27 May 2020 03:19:18 + Luis Chamberlain wrote:

I read your patch, and granted, I will accept I was under the incorrect
assumption that this can only be used by networking devices, however it
the devlink approach achieves getting userspace the ability with
iproute2 devlink util to query a device health, on to which we can peg
firmware health. But *this* patch series is not about health status and
letting users query it, its about a *critical* situation which has come up
with firmware requiring me to reboot my system, and the lack of *any*
infrastructure in the kernel today to inform userspace about it.

So say we use netlink to report a critical health situation, how are we
informing userspace with your patch series about requring a reboot?


One of main features of netlink is pub/sub model of notifications.

Whatever you imagine listening to your uevent can listen to
devlink-health notifications via devlink.

In fact I've shown this off in the RFC patches I sent to you, see
the devlink mon health command being used.


Yes but I looked at iputils2 devlink and seems I made an incorrect
assumption this can only be used for a network device rather than
a struct device.

I'll take a second look.


Hello Jakub,

I'm thinking about something similar to what Luis is proposing, but in
my case I'd like to report just when the driver knows the hardware is gone
and cannot be recovered, like when this is reported:

[ 2548.851832] WARNING: CPU: 3 PID: 98 at 
backports-4.19.98-1/net/mac80211/util.c:2040 ieee80211_reconfig+0x98/0xb64 
[mac80211]
[ 2548.856020] Hardware became unavailable during restart.

I'd like to be able to tie this into a watch-dog program to allow automatic 
reboot
of the system soon after this event is seen, for instance.

Could you post your devlink RFC patches somewhere public?

Thanks,
Ben


--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [RFC 1/2] devlink: add simple fw crash helpers

2020-05-25 Thread Ben Greear





On 05/25/2020 02:07 AM, Andy Shevchenko wrote:

On Fri, May 22, 2020 at 04:23:55PM -0700, Steve deRosier wrote:

On Fri, May 22, 2020 at 2:51 PM Luis Chamberlain  wrote:



I had to go RTFM re: kernel taints because it has been a very long
time since I looked at them. It had always seemed to me that most were
caused by "kernel-unfriendly" user actions.  The most famous of course
is loading proprietary modules, out-of-tree modules, forced module
loads, etc...  Honestly, I had forgotten the large variety of uses of
the taint flags. For anyone who hasn't looked at taints recently, I
recommend: 
https://www.kernel.org/doc/html/latest/admin-guide/tainted-kernels.html

In light of this I don't object to setting a taint on this anymore.
I'm a little uneasy, but I've softened on it now, and now I feel it
depends on implementation.

Specifically, I don't think we should set a taint flag when a driver
easily handles a routine firmware crash and is confident that things
have come up just fine again. In other words, triggering the taint in
every driver module where it spits out a log comment that it had a
firmware crash and had to recover seems too much. Sure, firmware
shouldn't crash, sure it should be open source so we can fix it,
whatever...


While it may sound idealistic the firmware for the end-user, and even for mere
kernel developer like me, is a complete blackbox which has more access than
root user in the kernel. We have tons of firmwares and each of them potentially
dangerous beast. As a user I really care about my data and privacy (hacker can
oops a firmware in order to set a specific vector attack). So, tainting kernel
is _a least_ we can do there, the strict rules would be to reboot immediately.


those sort of wishful comments simply ignore reality and
our ability to affect effective change.


We can encourage users not to buy cheap crap for the starter.


There is no stable wifi firmware for any price.

There is also no obvious feedback from even name-brand NICs like ath10k or AX200
when you report a crash.

That said, at least in my experience with ath10k-ct, the OS normally recovers 
fine
from firmware crashes.  ath10k already reports full crash reports on udev, so
easy for user-space to notice and report bug reports upstream if it cares to.  
Probably
other NICs do the same, and if not, they certainly could.

Thanks,
Ben


--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Ben Greear





On 05/18/2020 10:09 AM, Luis Chamberlain wrote:

On Mon, May 18, 2020 at 09:58:53AM -0700, Ben Greear wrote:



On 05/18/2020 09:51 AM, Luis Chamberlain wrote:

On Sat, May 16, 2020 at 03:24:01PM +0200, Johannes Berg wrote:

On Fri, 2020-05-15 at 21:28 +, Luis Chamberlain wrote:> 
module_firmware_crashed

You didn't CC me or the wireless list on the rest of the patches, so I'm
replying to a random one, but ...

What is the point here?

This should in no way affect the integrity of the system/kernel, for
most devices anyway.


Keyword you used here is "most device". And in the worst case, *who*
knows what other odd things may happen afterwards.


So what if ath10k's firmware crashes? If there's a driver bug it will
not handle it right (and probably crash, WARN_ON, or something else),
but if the driver is working right then that will not affect the kernel
at all.


Sometimes the device can go into a state which requires driver removal
and addition to get things back up.


It would be lovely to be able to detect this case in the driver/system
somehow!  I haven't seen any such cases recently,


I assure you that I have run into it. Once it does again I'll report
the crash, but the problem with some of this is that unless you scrape
the log you won't know. Eventually, a uevent would indeed tell inform
me.


but in case there is
some common case you see, maybe we can think of a way to detect it?


ath10k is just one case, this patch series addresses a simple way to
annotate this tree-wide.


So maybe I can understand that maybe you want an easy way to discover -
per device - that the firmware crashed, but that still doesn't warrant a
complete kernel taint.


That is one reason, another is that a taint helps support cases *fast*
easily detect if the issue was a firmware crash, instead of scraping
logs for driver specific ways to say the firmware has crashed.


You can listen for udev events (I think that is the right term),
and find crashes that way.  You get the actual crash info as well.


My follow up to this was to add uevent to add_taint() as well, this way
these could generically be processed by userspace.


I'm not opposed to the taint, though I have not thought much on it.

But, if you can already get the crash info from uevent, and it automatically
comes without polling or scraping logs, then what benefit beyond that does
the taint give you?

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-18 Thread Ben Greear





On 05/18/2020 09:51 AM, Luis Chamberlain wrote:

On Sat, May 16, 2020 at 03:24:01PM +0200, Johannes Berg wrote:

On Fri, 2020-05-15 at 21:28 +, Luis Chamberlain wrote:> 
module_firmware_crashed

You didn't CC me or the wireless list on the rest of the patches, so I'm
replying to a random one, but ...

What is the point here?

This should in no way affect the integrity of the system/kernel, for
most devices anyway.


Keyword you used here is "most device". And in the worst case, *who*
knows what other odd things may happen afterwards.


So what if ath10k's firmware crashes? If there's a driver bug it will
not handle it right (and probably crash, WARN_ON, or something else),
but if the driver is working right then that will not affect the kernel
at all.


Sometimes the device can go into a state which requires driver removal
and addition to get things back up.


It would be lovely to be able to detect this case in the driver/system
somehow!  I haven't seen any such cases recently, but in case there is
some common case you see, maybe we can think of a way to detect it?




So maybe I can understand that maybe you want an easy way to discover -
per device - that the firmware crashed, but that still doesn't warrant a
complete kernel taint.


That is one reason, another is that a taint helps support cases *fast*
easily detect if the issue was a firmware crash, instead of scraping
logs for driver specific ways to say the firmware has crashed.


You can listen for udev events (I think that is the right term),
and find crashes that way.  You get the actual crash info as well.

Thanks,
Ben


--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH] ath10k: increase rx buffer size to 2048

2020-04-28 Thread Ben Greear





On 04/28/2020 05:01 AM, Kalle Valo wrote:

Sven Eckelmann  writes:


On Wednesday, 1 April 2020 09:00:49 CEST Sven Eckelmann wrote:

On Wednesday, 5 February 2020 20:10:43 CEST Linus Lüssing wrote:

From: Linus Lüssing 

Before, only frames with a maximum size of 1528 bytes could be
transmitted between two 802.11s nodes.

For batman-adv for instance, which adds its own header to each frame,
we typically need an MTU of at least 1532 bytes to be able to transmit
without fragmentation.

This patch now increases the maxmimum frame size from 1528 to 1656
bytes.

[...]

@Kalle, I saw that this patch was marked as deferred [1] but I couldn't find
any mail why it was done so. It seems like this currently creates real world
problems - so would be nice if you could explain shortly what is currently
blocking its acceptance.


Ping?


Sorry for the delay, my plan was to first write some documentation about
different hardware families but haven't managed to do that yet.

My problem with this patch is that I don't know what hardware and
firmware versions were tested, so it needs analysis before I feel safe
to apply it. The ath10k hardware families are very different that even
if a patch works perfectly on one ath10k hardware it could still break
badly on another one.

What makes me faster to apply ath10k patches is to have comprehensive
analysis in the commit log. This shows me the patch author has
considered about all hardware families, not just the one he is testing
on, and that I don't need to do the analysis myself.


It has been in ath10k-ct for a while, and that has some fairly wide coverage
in OpenWrt, so likely if there were problems we would have seen it already.

I did not make any specific changes to firmware to support this, so upstream
firmware should behave similarly.

Seems like upstream ath10k could really benefit from having some test beds
so you can actually test code on different chips and have confidence
in your changes!

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: WARNING at net/mac80211/sta_info.c:1057 (__sta_info_destroy_part2())

2019-09-11 Thread Ben Greear





On 09/11/2019 06:21 AM, Linus Torvalds wrote:

On Wed, Sep 11, 2019 at 2:03 PM Ben Greear  wrote:


Out of curiosity, I'm interested to know what ath10k NIC chipset this is from.


It's a Dell XPS 13 9380, with

  02:00.0 Network controller: Qualcomm Atheros QCA6174 802.11ac
Wireless Network Adapter (rev 32)
Subsystem: Bigfoot Networks, Inc. Killer 1435 Wireless-AC

(numeric PCI ID 168c:003e, subsystem 1a56:143a).

The ath10k driver says

qca6174 hw3.2 target 0x0503 chip_id 0x00340aff sub 1a56:143a
firmware ver WLAN.RM.4.4.1-00140-QCARMSWPZ-1 api 6 features
wowlan,ignore-otp,mfp crc32 29eb8ca1
board_file api 2 bmi_id N/A crc32 4ed3569e

if that tells you anything more.


That means it is something I have never used nor have firmware for, but
the WMI logic should be similar to what I described and have experienced
with other chips.

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: WARNING at net/mac80211/sta_info.c:1057 (__sta_info_destroy_part2())

2019-09-11 Thread Ben Greear





On 09/11/2019 05:04 AM, Johannes Berg wrote:

On Wed, 2019-09-11 at 12:58 +0100, Linus Torvalds wrote:


And I didn't think about it or double-check, because the errors that
then followed later _looked_ like that TX power failing that I thought
hadn't happened.


Yeah, it could be something already got stuck there, hard to say.


Since we see that something actually did an rfkill operation. Did you
push a button there?


No, I tried to turn off and turn on Wifi manually (no button, just the
settings panel).


That does usually also cause rfkill, so that explains how we got down
this particular code path.


I didn't notice the WARN_ON(), I just noticed that there was no
networking, and "turn it off and on again" is obviously the first
thing to try ;)


:-)


Sep 11 10:27:13 xps13 kernel: WARNING: CPU: 4 PID: 1246 at
net/mac80211/sta_info.c:1057 __sta_info_destroy_part2+0x147/0x150
[mac80211]

but if you want full logs I can send them in private to you.


No, it's fine, though maybe Kalle does - he was stepping out for a while
but said he'd look later.

This is the interesting time - 10:27:13 we get one of the first
failures. Really the first one was this:


Sep 11 10:27:07 xps13 kernel: ath10k_pci :02:00.0: wmi command 16387 
timeout, restarting hardware




I do suspect it's atheros and suspend/resume or something. The
wireless clearly worked for a while after the resume, but then at some
point it stopped.


I'm not really sure it's related to suspend/resume at all, the firmware
seems to just have gotten stuck, and the device and firmware most likely
got reset over the suspend/resume anyway.


The only explanation I therefore have is that something is just taking
*forever* in that code path, hence my question about timing information
on the logs.


Yeah, maybe it would time out everything eventually. But not for a
long time. It hadn't cleared up by

  Sep 11 10:36:21 xps13 gnome-session-f[6837]: gnome-session-failed:
Fatal IO error 0 (Success) on X server :0.


Ok, that's way longer than I would have guessed even! That's over 9
minutes, that'd be close to 200 commands having to be issued and timing
out ...

I don't know. What I wrote before is basically all I can say, I think
the driver gets stuck somewhere waiting for the device "forever", and
the stack just doesn't get to release the lock, causing all the follow-
up problems.


It looks to me like the ath10k firmware is not responding to commands and/or
is out of its WMI tx credits.  The code often takes a lock and then blocks for 
up to 3
or so seconds waiting for a response from the firmware, and the mac80211 calling
code is often already holding rtnl.  Pretty much every mac80211 call will cause 
a
WMI message and thus potentially hit this timeout.

This can easily cause rtnl to be held for 3 seconds, but after that, I believe
upstream ath10k will now time out and kill the firmware and restart.  (I run
a significantly modified ath10k driver, and that is how mine works, at least.)

In this case, it looks like restarting the firmware/NIC failed, and I guess
that must get it in a state where it is still blocking and trying to talk
to the firmware?  Or maybe deadlock down inside ath10k driver.

For what it's worth, we see that WARN_ON often when ath10k firmware crashes, 
but it
seems to not be a big deal and the system normally recovers fine.

Out of curiosity, I'm interested to know what ath10k NIC chipset this is from.

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH] x86/speculation: Add document to describe Spectre and its mitigations

2019-01-08 Thread Ben Greear


On 1/7/19 9:57 AM, Tim Chen wrote:

On 12/31/18 8:22 AM, Ben Greear wrote:



On 12/21/2018 05:17 PM, Tim Chen wrote:


If you don't worry about security and performance is paramount, then
boot with "nospectre_v2".  That's explained in the document.


There seem to be lots of different variants of this type of problem.  It was 
not clear
to me that just doing nospectre_v2 would be sufficient to get back full 
performance.


The performance penalty comes from retpoline penalizing indirect branch 
predictions in kernel.
With nospectre_v2, retpoline is disabled so you should get all the performance
back from spectre mitigation.

This does not disable kernel page table isolation for meltdown mitigation, 
which also
needs to be turned off if you want to get the full performance back.  That's 
somewhat
beyond the scope of this doc on Spectre.


The two bug families (spectre and meltdown) are conflated in my mind, at least.

For those of us who do not really understand this stuff in detail, it would
be good to at least mention some notes about Meltdown I think.


And anyway, I would like to compile the kernel to not need that command-line 
option,
so I am still interesting in what compile options need to be set to what 
values...



If you just want to disable spectre mitigation, setting CONFIG_RETPOLINE=n 
should do
the  trick.  If you also want to disable meltdown mitigation,
set CONFIG_PAGE_TABLE_ISOLATION=n.


Ok, are there any other CONFIG options that relate to fixing security bugs that
have noticeable performance impacts or are these two the complete list?

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH] x86/speculation: Add document to describe Spectre and its mitigations

2018-12-31 Thread Ben Greear





On 12/21/2018 05:17 PM, Tim Chen wrote:

On 12/21/18 1:59 PM, Ben Greear wrote:

On 12/21/18 9:44 AM, Tim Chen wrote:

Thomas,

Andi and I have made an update to our draft of the Spectre admin guide.
We may be out on Christmas vacation for a while.  But we want to
send it out for everyone to take a look.


Can you add a section on how to compile out all mitigations that have anything
beyond negligible performance impact for those running systems where performance
is more important than security?



If you don't worry about security and performance is paramount, then
boot with "nospectre_v2".  That's explained in the document.


There seem to be lots of different variants of this type of problem.  It was 
not clear
to me that just doing nospectre_v2 would be sufficient to get back full 
performance.

And anyway, I would like to compile the kernel to not need that command-line 
option,
so I am still interesting in what compile options need to be set to what 
values...

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH] x86/speculation: Add document to describe Spectre and its mitigations

2018-12-21 Thread Ben Greear


On 12/21/18 9:44 AM, Tim Chen wrote:

Thomas,

Andi and I have made an update to our draft of the Spectre admin guide.
We may be out on Christmas vacation for a while.  But we want to
send it out for everyone to take a look.


Can you add a section on how to compile out all mitigations that have anything
beyond negligible performance impact for those running systems where performance
is more important than security?

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: Oops on the system startup in the function part_in_flight()

2018-05-04 Thread Ben Greear


Hello,

I am trying to bisect a pktgen related performance regression that appears to
be between the 4.13 and 4.14 kernels.  But, my system keeps crashing due to 
part_in_flight
oops so bisecting is not going well.

It looks like this oops was fixed, but the link to the patch in the email is no 
longer
valid.  Can someone let me know what patch fixes this crash so I can apply it 
while
bisecting?

https://www.spinics.net/lists/linux-block/msg17809.html

Thanks,
Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: Oops on the system startup in the function part_in_flight()

2018-05-04 Thread Ben Greear


Hello,

I am trying to bisect a pktgen related performance regression that appears to
be between the 4.13 and 4.14 kernels.  But, my system keeps crashing due to 
part_in_flight
oops so bisecting is not going well.

It looks like this oops was fixed, but the link to the patch in the email is no 
longer
valid.  Can someone let me know what patch fixes this crash so I can apply it 
while
bisecting?

https://www.spinics.net/lists/linux-block/msg17809.html

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH] net: dev_forward_skb(): Scrub packet's per-netns info only when crossing netns

2018-03-20 Thread Ben Greear


On 03/20/2018 09:44 AM, Liran Alon wrote:



On 20/03/18 18:24, ebied...@xmission.com wrote:


I don't believe the current behavior is a bug.

I looked through the history.  Basically skb_scrub_packet
started out as the scrubbing needed for crossing network
namespaces.

Then tunnels which needed 90% of the functionality started
calling it, with the xnet flag added.  Because the tunnels
needed to preserve their historic behavior.

Then dev_forward_skb started calling skb_scrub_packet.

A veth pair is supposed to give the same behavior as a cross-over
cable plugged into two local nics.  A cross over cable won't
preserve things like the skb mark.  So I don't see why anyone would
expect a veth pair to preserve the mark.


I disagree with this argument.

I think that a skb crossing netns is what simulates a real packet crossing 
physical computers. Following your argument, why would skb->mark should be 
preserved
when crossing netdevs on same netns via routing? But this does today preserve 
skb->mark.

Therefore, I do think that skb->mark should conceptually only be scrubbed when 
crossing netns. Regardless of the netdev used to cross it.


It should be scrubbed in VETH as well.  That is one way to make virtual 
routers.  Possibly
the newer VRF features will give another better way to do it, but you should 
not break
things that used to work.

Now, if you want to add a new feature that allows one to configure the kernel 
(or VETH) for
a new behavior, then that might be something to consider.


Right now I don't see the point of handling packets that don't cross
network namespace boundaries specially, other than to preserve backwards
compatibility.


Well, backwards compat is a big deal all by itself!

Thanks,
Ben



Eric







--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH] net: dev_forward_skb(): Scrub packet's per-netns info only when crossing netns

2018-03-20 Thread Ben Greear


On 03/20/2018 09:44 AM, Liran Alon wrote:



On 20/03/18 18:24, ebied...@xmission.com wrote:


I don't believe the current behavior is a bug.

I looked through the history.  Basically skb_scrub_packet
started out as the scrubbing needed for crossing network
namespaces.

Then tunnels which needed 90% of the functionality started
calling it, with the xnet flag added.  Because the tunnels
needed to preserve their historic behavior.

Then dev_forward_skb started calling skb_scrub_packet.

A veth pair is supposed to give the same behavior as a cross-over
cable plugged into two local nics.  A cross over cable won't
preserve things like the skb mark.  So I don't see why anyone would
expect a veth pair to preserve the mark.


I disagree with this argument.

I think that a skb crossing netns is what simulates a real packet crossing 
physical computers. Following your argument, why would skb->mark should be 
preserved
when crossing netdevs on same netns via routing? But this does today preserve 
skb->mark.

Therefore, I do think that skb->mark should conceptually only be scrubbed when 
crossing netns. Regardless of the netdev used to cross it.


It should be scrubbed in VETH as well.  That is one way to make virtual 
routers.  Possibly
the newer VRF features will give another better way to do it, but you should 
not break
things that used to work.

Now, if you want to add a new feature that allows one to configure the kernel 
(or VETH) for
a new behavior, then that might be something to consider.


Right now I don't see the point of handling packets that don't cross
network namespace boundaries specially, other than to preserve backwards
compatibility.


Well, backwards compat is a big deal all by itself!

Thanks,
Ben



Eric







--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH net] virtio-net: disable NAPI only when enabled during XDP set

2018-02-28 Thread Ben Greear

On 02/28/2018 09:22 AM, David Miller wrote:

From: Jason Wang <jasow...@redhat.com>
Date: Wed, 28 Feb 2018 18:20:04 +0800

We try to disable NAPI to prevent a single XDP TX queue being used by
multiple cpus. But we don't check if device is up (NAPI is enabled),
this could result stall because of infinite wait in
napi_disable(). Fixing this by checking device state through
netif_running() before.

Fixes: 4941d472bf95b ("virtio-net: do not reset during XDP set")
Signed-off-by: Jason Wang <jasow...@redhat.com>

Yes, mis-paired NAPI enable/disable are really a pain.

Probably, we can do something in the interfaces or mechanisms to make
this less error prone and less fragile.

Anyways, applied and queued up for -stable, thanks!

I just hit a similar bug in ath10k.  It seems like napi has plenty
of free bit flags so it could keep track of 'is-enabled' state and
allow someone to call napi_disable multiple times w/out deadlocking.

Thanks,
Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH net] virtio-net: disable NAPI only when enabled during XDP set

2018-02-28 Thread Ben Greear

On 02/28/2018 09:22 AM, David Miller wrote:

From: Jason Wang 
Date: Wed, 28 Feb 2018 18:20:04 +0800

We try to disable NAPI to prevent a single XDP TX queue being used by
multiple cpus. But we don't check if device is up (NAPI is enabled),
this could result stall because of infinite wait in
napi_disable(). Fixing this by checking device state through
netif_running() before.

Fixes: 4941d472bf95b ("virtio-net: do not reset during XDP set")
Signed-off-by: Jason Wang 

Yes, mis-paired NAPI enable/disable are really a pain.

Probably, we can do something in the interfaces or mechanisms to make
this less error prone and less fragile.

Anyways, applied and queued up for -stable, thanks!

I just hit a similar bug in ath10k.  It seems like napi has plenty
of free bit flags so it could keep track of 'is-enabled' state and
allow someone to call napi_disable multiple times w/out deadlocking.

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 1/3] ath10k: remove ath10k_vif_to_arvif()

2017-02-10 Thread Ben Greear


On 02/09/2017 11:03 PM, Valo, Kalle wrote:

Ben Greear <gree...@candelatech.com> writes:


On 02/07/2017 01:14 AM, Valo, Kalle wrote:

Adrian Chadd <adr...@freebsd.org> writes:


Removing this method makes the diff to FreeBSD larger, as "vif" in
FreeBSD is a different pointer.

(Yes, I have ath10k on freebsd working and I'd like to find a way to
reduce the diff moving forward.)


I don't like this "(void *) vif->drv_priv" style that much either but
apparently it's commonly used in Linux wireless code and already parts
of ath10k. So this patch just unifies the coding style.


Surely the code compiles to the same thing, so why add a patch that
makes it more difficult for Adrian and makes the code no easier to read
for the rest of us?


Because that's the coding style used already in Linux. It's great to see
that parts of ath10k can be used also in other systems but in principle
I'm not very fond of the idea starting to reject valid upstream patches
because of driver forks.


There are lots of people trying to maintain out-of-tree or backported patches 
to ath10k,
and every time there is a meaningless style change, that just makes us
waste more time on useless work instead of having time to work on more important
matters.

Thanks,
Ben


I think backports project is doing it right, it's not limiting upstream
development in any way and handles all the API changes internally. Maybe
FreeBSD could do something similar?




--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 1/3] ath10k: remove ath10k_vif_to_arvif()

2017-02-10 Thread Ben Greear


On 02/09/2017 11:03 PM, Valo, Kalle wrote:

Ben Greear  writes:


On 02/07/2017 01:14 AM, Valo, Kalle wrote:

Adrian Chadd  writes:


Removing this method makes the diff to FreeBSD larger, as "vif" in
FreeBSD is a different pointer.

(Yes, I have ath10k on freebsd working and I'd like to find a way to
reduce the diff moving forward.)


I don't like this "(void *) vif->drv_priv" style that much either but
apparently it's commonly used in Linux wireless code and already parts
of ath10k. So this patch just unifies the coding style.


Surely the code compiles to the same thing, so why add a patch that
makes it more difficult for Adrian and makes the code no easier to read
for the rest of us?


Because that's the coding style used already in Linux. It's great to see
that parts of ath10k can be used also in other systems but in principle
I'm not very fond of the idea starting to reject valid upstream patches
because of driver forks.


There are lots of people trying to maintain out-of-tree or backported patches 
to ath10k,
and every time there is a meaningless style change, that just makes us
waste more time on useless work instead of having time to work on more important
matters.

Thanks,
Ben


I think backports project is doing it right, it's not limiting upstream
development in any way and handles all the API changes internally. Maybe
FreeBSD could do something similar?




--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 1/3] ath10k: remove ath10k_vif_to_arvif()

2017-02-07 Thread Ben Greear




On 02/07/2017 01:14 AM, Valo, Kalle wrote:

Adrian Chadd <adr...@freebsd.org> writes:


Removing this method makes the diff to FreeBSD larger, as "vif" in
FreeBSD is a different pointer.

(Yes, I have ath10k on freebsd working and I'd like to find a way to
reduce the diff moving forward.)


I don't like this "(void *) vif->drv_priv" style that much either but
apparently it's commonly used in Linux wireless code and already parts
of ath10k. So this patch just unifies the coding style.


Surely the code compiles to the same thing, so why add a patch that
makes it more difficult for Adrian and makes the code no easier to read
for the rest of us?

Thanks,
Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 1/3] ath10k: remove ath10k_vif_to_arvif()

2017-02-07 Thread Ben Greear




On 02/07/2017 01:14 AM, Valo, Kalle wrote:

Adrian Chadd  writes:


Removing this method makes the diff to FreeBSD larger, as "vif" in
FreeBSD is a different pointer.

(Yes, I have ath10k on freebsd working and I'd like to find a way to
reduce the diff moving forward.)


I don't like this "(void *) vif->drv_priv" style that much either but
apparently it's commonly used in Linux wireless code and already parts
of ath10k. So this patch just unifies the coding style.


Surely the code compiles to the same thing, so why add a patch that
makes it more difficult for Adrian and makes the code no easier to read
for the rest of us?

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH] crypto: ccm - avoid scatterlist for MAC encryption

2016-10-19 Thread Ben Greear




On 10/19/2016 08:08 AM, Ard Biesheuvel wrote:

On 19 October 2016 at 08:43, Johannes Berg <johan...@sipsolutions.net> wrote:

On Wed, 2016-10-19 at 11:31 +0800, Herbert Xu wrote:



We could probably make mac80211 do that too, but can we guarantee in-
order processing? Anyway, it's pretty low priority, maybe never
happening, since hardly anyone really uses "software" crypto, the wifi
devices mostly have it built in anyway.



Indeed. The code is now correct in terms of API requirements, so let's
just wait for someone to complain about any performance regressions.


Do you actually expect performance regressions?  I'll be complaining if
so, but will test first :)

Thanks,
Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH] crypto: ccm - avoid scatterlist for MAC encryption

2016-10-19 Thread Ben Greear




On 10/19/2016 08:08 AM, Ard Biesheuvel wrote:

On 19 October 2016 at 08:43, Johannes Berg  wrote:

On Wed, 2016-10-19 at 11:31 +0800, Herbert Xu wrote:



We could probably make mac80211 do that too, but can we guarantee in-
order processing? Anyway, it's pretty low priority, maybe never
happening, since hardly anyone really uses "software" crypto, the wifi
devices mostly have it built in anyway.



Indeed. The code is now correct in terms of API requirements, so let's
just wait for someone to complain about any performance regressions.


Do you actually expect performance regressions?  I'll be complaining if
so, but will test first :)

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: Question on kzalloc and GFP_DMA32

2016-09-28 Thread Ben Greear


On 09/28/2016 02:11 PM, David Rientjes wrote:

On Tue, 27 Sep 2016, Ben Greear wrote:



I have been running this patch for a while:

ath10k:  Use GPF_DMA32 for firmware swap memory.

This fixes OS crash when using QCA 9984 NIC on x86-64 system
without vt-d enabled.

Also tested on ea8500 with 9980, and x86-64 with 9980 and 9880.

All tests were with CT firmware.

Signed-off-by: Ben Greear <gree...@candelatech.com>

 drivers/net/wireless/ath/ath10k/wmi.c

index e20aa39..727b3aa 100644
@@ -4491,7 +4491,7 @@ static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32
req_id,
 if (!pool_size)
 return -EINVAL;

-vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN);
+vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN | GFP_DMA32);
 if (!vaddr)
 num_units /= 2;
 }


It sometimes works, on some systems, but then it also often fails with the
splat
below, especially with several NICs in the system.



I suppose it's failing sometimes because the BUG() will trigger when
trying to allocate new slab or CONFIG_ZONE_DMA32 isn't configured.  That
shouldn't panic the kernel anymore since commit 72baeef0c271 ("slab: do
not panic on invalid gfp_mask") in 4.8, but you shouldn't be doing
kzalloc(..., ... | GFP_DMA32) anyway.


CONFIG_ZONE_DMA32 is enabled in my .config.


pool_size is relatively large (maybe 256k or so).



If it's 256k, why allocate through the slab allocator?  Why not
alloc_pages(GFP_KERNEL | GFP_DMA32 | __GFP_ZERO | __GFP_NOWARN,
get_order(pool_size))?


I really don't understand the (subtle?) difference between alloc_pages and 
kzalloc,
but I will give your suggestion a try and see if it works.  If you have time, 
maybe you could take
a look at drivers/net/wireless/ath/ath10k/wmi.c in the ath10k_wmi_alloc_chunk 
method
to see if you notice any problems with using alloc_pages there?

Thanks for the suggestion.


--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: Question on kzalloc and GFP_DMA32

2016-09-28 Thread Ben Greear


On 09/28/2016 02:11 PM, David Rientjes wrote:

On Tue, 27 Sep 2016, Ben Greear wrote:



I have been running this patch for a while:

ath10k:  Use GPF_DMA32 for firmware swap memory.

This fixes OS crash when using QCA 9984 NIC on x86-64 system
without vt-d enabled.

Also tested on ea8500 with 9980, and x86-64 with 9980 and 9880.

All tests were with CT firmware.

Signed-off-by: Ben Greear 

 drivers/net/wireless/ath/ath10k/wmi.c

index e20aa39..727b3aa 100644
@@ -4491,7 +4491,7 @@ static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32
req_id,
 if (!pool_size)
 return -EINVAL;

-vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN);
+vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN | GFP_DMA32);
 if (!vaddr)
 num_units /= 2;
 }


It sometimes works, on some systems, but then it also often fails with the
splat
below, especially with several NICs in the system.



I suppose it's failing sometimes because the BUG() will trigger when
trying to allocate new slab or CONFIG_ZONE_DMA32 isn't configured.  That
shouldn't panic the kernel anymore since commit 72baeef0c271 ("slab: do
not panic on invalid gfp_mask") in 4.8, but you shouldn't be doing
kzalloc(..., ... | GFP_DMA32) anyway.


CONFIG_ZONE_DMA32 is enabled in my .config.


pool_size is relatively large (maybe 256k or so).



If it's 256k, why allocate through the slab allocator?  Why not
alloc_pages(GFP_KERNEL | GFP_DMA32 | __GFP_ZERO | __GFP_NOWARN,
get_order(pool_size))?


I really don't understand the (subtle?) difference between alloc_pages and 
kzalloc,
but I will give your suggestion a try and see if it works.  If you have time, 
maybe you could take
a look at drivers/net/wireless/ath/ath10k/wmi.c in the ath10k_wmi_alloc_chunk 
method
to see if you notice any problems with using alloc_pages there?

Thanks for the suggestion.


--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Question on kzalloc and GFP_DMA32

2016-09-27 Thread Ben Greear



I have been running this patch for a while:

ath10k:  Use GPF_DMA32 for firmware swap memory.

This fixes OS crash when using QCA 9984 NIC on x86-64 system
without vt-d enabled.

Also tested on ea8500 with 9980, and x86-64 with 9980 and 9880.

All tests were with CT firmware.

Signed-off-by: Ben Greear <gree...@candelatech.com>

 drivers/net/wireless/ath/ath10k/wmi.c 
index e20aa39..727b3aa 100644
@@ -4491,7 +4491,7 @@ static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32 
req_id,
 if (!pool_size)
 return -EINVAL;

-vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN);
+vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN | GFP_DMA32);
 if (!vaddr)
 num_units /= 2;
 }


It sometimes works, on some systems, but then it also often fails with the splat
below, especially with several NICs in the system.

pool_size is relatively large (maybe 256k or so).

Any idea for a more proper way to do this?


gfp: 4
[ cut here ]
kernel BUG at /home/greearb/git/linux-4.7.dev.y/mm/slub.c:1508!
invalid opcode:  [#1] PREEMPT SMP
Modules linked in: coretemp hwmon ath9k intel_rapl ath10k_pci x86_pkg_temp_thermal ath9k_common ath10k_core intel_powerclamp ath9k_hw ath kvm iTCO_wdt mac80211 
iTCO_vendor_support irqbypass snd_hda_codec_hdmi 6

CPU: 2 PID: 268 Comm: kworker/u8:5 Not tainted 4.7.2+ #16
Hardware name: To be filled by O.E.M. To be filled by O.E.M./ChiefRiver, BIOS 
4.6.5 06/07/2013
Workqueue: ath10k_aux_wq ath10k_wmi_event_service_ready_work [ath10k_core]
task: 880036433a00 ti: 88003644 task.ti: 88003644
RIP: 0010:[]  [] new_slab+0x39a/0x410
RSP: 0018:880036443b58  EFLAGS: 00010092
RAX: 0006 RBX: 024082c4 RCX: 
RDX: 0006 RSI: 88021e30dd08 RDI: 88021e30dd08
RBP: 880036443b90 R08:  R09: 
R10:  R11: 0372 R12: 88021dc01200
R13: 88021dc00cc0 R14: 88021dc01200 R15: 0001
FS:  () GS:88021e30() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7f3e65c1c730 CR3: 01e06000 CR4: 001406e0
Stack:
 8127a4fc 0a01ff10 024082c4 88021dc01200
 88021dc00cc0 88021dc01200 0001 880036443c58
 81247ac6 88021e31b360 880036433a00 880036433a00
Call Trace:
 [] ? __d_lookup+0x9c/0x160
 [] ___slab_alloc+0x396/0x4a0
 [] ? ath10k_wmi_event_service_ready_work+0x5ad/0x800 
[ath10k_core]
 [] ? alloc_kmem_pages+0x9/0x10
 [] ? kmalloc_order+0x13/0x40
 [] ? ath10k_wmi_event_service_ready_work+0x5ad/0x800 
[ath10k_core]
 [] __slab_alloc.isra.72+0x26/0x40
 [] __kmalloc+0x147/0x1b0
 [] ath10k_wmi_event_service_ready_work+0x5ad/0x800 
[ath10k_core]
 [] ? dequeue_entity+0x261/0xac0
 [] process_one_work+0x148/0x420
 [] worker_thread+0x49/0x480
 [] ? rescuer_thread+0x330/0x330
 [] kthread+0xc4/0xe0
 [] ret_from_fork+0x1f/0x40
 [] ? kthread_create_on_node+0x170/0x170
Code: e9 65 fd ff ff 49 8b 57 20 48 8d 42 ff 83 e2 01 49 0f 44 c7 f0 80 08 40 e9 6f fd ff ff 89 c6 48 c7 c7 01 36 c7 81 e8 e8 40 fa ff <0f> 0b ba 00 10 00 00 be 
5a 00 00 00 48 89 c7 48 d3 e2 e8 bf 18

RIP  [] new_slab+0x39a/0x410
 RSP 
---[ end trace ea3b0043b2911d93 ]---


static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
{
if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
pr_emerg("gfp: %u\n", flags & GFP_SLAB_BUG_MASK);
BUG();
}

return allocate_slab(s,
flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
}


Thanks,
Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Question on kzalloc and GFP_DMA32

2016-09-27 Thread Ben Greear



I have been running this patch for a while:

ath10k:  Use GPF_DMA32 for firmware swap memory.

This fixes OS crash when using QCA 9984 NIC on x86-64 system
without vt-d enabled.

Also tested on ea8500 with 9980, and x86-64 with 9980 and 9880.

All tests were with CT firmware.

Signed-off-by: Ben Greear 

 drivers/net/wireless/ath/ath10k/wmi.c 
index e20aa39..727b3aa 100644
@@ -4491,7 +4491,7 @@ static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32 
req_id,
 if (!pool_size)
 return -EINVAL;

-vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN);
+vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN | GFP_DMA32);
 if (!vaddr)
 num_units /= 2;
 }


It sometimes works, on some systems, but then it also often fails with the splat
below, especially with several NICs in the system.

pool_size is relatively large (maybe 256k or so).

Any idea for a more proper way to do this?


gfp: 4
[ cut here ]
kernel BUG at /home/greearb/git/linux-4.7.dev.y/mm/slub.c:1508!
invalid opcode:  [#1] PREEMPT SMP
Modules linked in: coretemp hwmon ath9k intel_rapl ath10k_pci x86_pkg_temp_thermal ath9k_common ath10k_core intel_powerclamp ath9k_hw ath kvm iTCO_wdt mac80211 
iTCO_vendor_support irqbypass snd_hda_codec_hdmi 6

CPU: 2 PID: 268 Comm: kworker/u8:5 Not tainted 4.7.2+ #16
Hardware name: To be filled by O.E.M. To be filled by O.E.M./ChiefRiver, BIOS 
4.6.5 06/07/2013
Workqueue: ath10k_aux_wq ath10k_wmi_event_service_ready_work [ath10k_core]
task: 880036433a00 ti: 88003644 task.ti: 88003644
RIP: 0010:[]  [] new_slab+0x39a/0x410
RSP: 0018:880036443b58  EFLAGS: 00010092
RAX: 0006 RBX: 024082c4 RCX: 
RDX: 0006 RSI: 88021e30dd08 RDI: 88021e30dd08
RBP: 880036443b90 R08:  R09: 
R10:  R11: 0372 R12: 88021dc01200
R13: 88021dc00cc0 R14: 88021dc01200 R15: 0001
FS:  () GS:88021e30() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7f3e65c1c730 CR3: 01e06000 CR4: 001406e0
Stack:
 8127a4fc 0a01ff10 024082c4 88021dc01200
 88021dc00cc0 88021dc01200 0001 880036443c58
 81247ac6 88021e31b360 880036433a00 880036433a00
Call Trace:
 [] ? __d_lookup+0x9c/0x160
 [] ___slab_alloc+0x396/0x4a0
 [] ? ath10k_wmi_event_service_ready_work+0x5ad/0x800 
[ath10k_core]
 [] ? alloc_kmem_pages+0x9/0x10
 [] ? kmalloc_order+0x13/0x40
 [] ? ath10k_wmi_event_service_ready_work+0x5ad/0x800 
[ath10k_core]
 [] __slab_alloc.isra.72+0x26/0x40
 [] __kmalloc+0x147/0x1b0
 [] ath10k_wmi_event_service_ready_work+0x5ad/0x800 
[ath10k_core]
 [] ? dequeue_entity+0x261/0xac0
 [] process_one_work+0x148/0x420
 [] worker_thread+0x49/0x480
 [] ? rescuer_thread+0x330/0x330
 [] kthread+0xc4/0xe0
 [] ret_from_fork+0x1f/0x40
 [] ? kthread_create_on_node+0x170/0x170
Code: e9 65 fd ff ff 49 8b 57 20 48 8d 42 ff 83 e2 01 49 0f 44 c7 f0 80 08 40 e9 6f fd ff ff 89 c6 48 c7 c7 01 36 c7 81 e8 e8 40 fa ff <0f> 0b ba 00 10 00 00 be 
5a 00 00 00 48 89 c7 48 d3 e2 e8 bf 18

RIP  [] new_slab+0x39a/0x410
 RSP 
---[ end trace ea3b0043b2911d93 ]---


static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
{
if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
pr_emerg("gfp: %u\n", flags & GFP_SLAB_BUG_MASK);
BUG();
}

return allocate_slab(s,
flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
}


Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH] Add .set_antenna callback in ath6kl driver to remove wireless core warns

2016-06-08 Thread Ben Greear


On 06/08/2016 08:56 AM, Prasun Maiti wrote:

Please help me how to test this one?? It will be great to me if you help me.


I do not have time or interest, sorry.  You basically need access to
firmware source to do any useful development on this driver in my opinion,
and I do not have access to that source.

Why are you so concerned about the warning anyway?

Thanks,
Ben



On Wed, Jun 8, 2016 at 9:21 PM, Ben Greear <gree...@candelatech.com> wrote:

On 06/08/2016 08:46 AM, Prasun Maiti wrote:


Please tell me if I mention that this code is untested in commit log,
then could you check the code kindly and also help me to fix this type
of warning?



In my experience, ath6kl has very fragile and buggy firmware, so I would
not add any new API to it unless you have tested this thoroughly.

Thanks,
Ben



On Wed, Jun 8, 2016 at 9:00 PM, Valo, Kalle <kv...@qca.qualcomm.com>
wrote:


Prasun Maiti <prasunmait...@gmail.com> writes:


I am not sure it works fine. Like ath6kl driver send another cmd to
firmare, I have just filled up the cmd buffer with "tx_ant", and
"rx_ant" values, then use "ath6kl_wmi_cmd_send()" api to send the cmd
buffer to firmware. I have resend the patch as there are some errors
in the previous patch. Let me know if any modifications are needed?



I don't take untested code. In some special cases it might be ok to send
untested code but even then it needs to be clearly stated in the commit
log that it's untested.

Please resend once you have tested this, I'm dropping this now.

--
Kalle Valo








--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com








--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH] Add .set_antenna callback in ath6kl driver to remove wireless core warns

2016-06-08 Thread Ben Greear


On 06/08/2016 08:56 AM, Prasun Maiti wrote:

Please help me how to test this one?? It will be great to me if you help me.


I do not have time or interest, sorry.  You basically need access to
firmware source to do any useful development on this driver in my opinion,
and I do not have access to that source.

Why are you so concerned about the warning anyway?

Thanks,
Ben



On Wed, Jun 8, 2016 at 9:21 PM, Ben Greear  wrote:

On 06/08/2016 08:46 AM, Prasun Maiti wrote:


Please tell me if I mention that this code is untested in commit log,
then could you check the code kindly and also help me to fix this type
of warning?



In my experience, ath6kl has very fragile and buggy firmware, so I would
not add any new API to it unless you have tested this thoroughly.

Thanks,
Ben



On Wed, Jun 8, 2016 at 9:00 PM, Valo, Kalle 
wrote:


Prasun Maiti  writes:


I am not sure it works fine. Like ath6kl driver send another cmd to
firmare, I have just filled up the cmd buffer with "tx_ant", and
"rx_ant" values, then use "ath6kl_wmi_cmd_send()" api to send the cmd
buffer to firmware. I have resend the patch as there are some errors
in the previous patch. Let me know if any modifications are needed?



I don't take untested code. In some special cases it might be ok to send
untested code but even then it needs to be clearly stated in the commit
log that it's untested.

Please resend once you have tested this, I'm dropping this now.

--
Kalle Valo








--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com








--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH] Add .set_antenna callback in ath6kl driver to remove wireless core warns

2016-06-08 Thread Ben Greear


On 06/08/2016 08:46 AM, Prasun Maiti wrote:

Please tell me if I mention that this code is untested in commit log,
then could you check the code kindly and also help me to fix this type
of warning?


In my experience, ath6kl has very fragile and buggy firmware, so I would
not add any new API to it unless you have tested this thoroughly.

Thanks,
Ben



On Wed, Jun 8, 2016 at 9:00 PM, Valo, Kalle <kv...@qca.qualcomm.com> wrote:

Prasun Maiti <prasunmait...@gmail.com> writes:


I am not sure it works fine. Like ath6kl driver send another cmd to
firmare, I have just filled up the cmd buffer with "tx_ant", and
"rx_ant" values, then use "ath6kl_wmi_cmd_send()" api to send the cmd
buffer to firmware. I have resend the patch as there are some errors
in the previous patch. Let me know if any modifications are needed?


I don't take untested code. In some special cases it might be ok to send
untested code but even then it needs to be clearly stated in the commit
log that it's untested.

Please resend once you have tested this, I'm dropping this now.

--
Kalle Valo







--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH] Add .set_antenna callback in ath6kl driver to remove wireless core warns

2016-06-08 Thread Ben Greear


On 06/08/2016 08:46 AM, Prasun Maiti wrote:

Please tell me if I mention that this code is untested in commit log,
then could you check the code kindly and also help me to fix this type
of warning?


In my experience, ath6kl has very fragile and buggy firmware, so I would
not add any new API to it unless you have tested this thoroughly.

Thanks,
Ben



On Wed, Jun 8, 2016 at 9:00 PM, Valo, Kalle  wrote:

Prasun Maiti  writes:


I am not sure it works fine. Like ath6kl driver send another cmd to
firmare, I have just filled up the cmd buffer with "tx_ant", and
"rx_ant" values, then use "ath6kl_wmi_cmd_send()" api to send the cmd
buffer to firmware. I have resend the patch as there are some errors
in the previous patch. Let me know if any modifications are needed?


I don't take untested code. In some special cases it might be ok to send
untested code but even then it needs to be clearly stated in the commit
log that it's untested.

Please resend once you have tested this, I'm dropping this now.

--
Kalle Valo







--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don???t modify ip_summed; doing so treats packets with bad checksums as good.

2016-05-13 Thread Ben Greear

On 05/13/2016 11:21 AM, David Miller wrote:

From: Ben Greear <gree...@candelatech.com>
Date: Fri, 13 May 2016 09:57:19 -0700

How do you feel about a new socket-option to allow a socket to
request the old veth behaviour?

I depend upon the opinions of the experts who work upstream on and
maintain these components, since it is an area I am not so familiar
with.

Generally speaking asking me directly for opinions on matters like
this isn't the way to go, in fact I kind of find it irritating.  It
can't all be on me.

Fair enough, thanks for your time.

Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don???t modify ip_summed; doing so treats packets with bad checksums as good.

2016-05-13 Thread Ben Greear

On 05/13/2016 11:21 AM, David Miller wrote:

From: Ben Greear 
Date: Fri, 13 May 2016 09:57:19 -0700

How do you feel about a new socket-option to allow a socket to
request the old veth behaviour?

I depend upon the opinions of the experts who work upstream on and
maintain these components, since it is an area I am not so familiar
with.

Generally speaking asking me directly for opinions on matters like
this isn't the way to go, in fact I kind of find it irritating.  It
can't all be on me.

Fair enough, thanks for your time.

Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don???t modify ip_summed; doing so treats packets with bad checksums as good.

2016-05-13 Thread Ben Greear


Mr Miller:

How do you feel about a new socket-option to allow a socket to
request the old veth behaviour?

Thanks,
Ben

On 04/30/2016 10:30 PM, Willy Tarreau wrote:

On Sat, Apr 30, 2016 at 03:43:51PM -0700, Ben Greear wrote:

On 04/30/2016 03:01 PM, Vijay Pandurangan wrote:

Consider:

- App A  sends out corrupt packets 50% of the time and discards inbound data.

(...)

How can you make a generic app C know how to do this?  The path could be,
for instance:

eth0 <-> user-space-A <-> vethA <-> vethB <-> { kernel routing logic } <-> vethC <-> 
vethD <-> appC

There are no sockets on vethB, but it does need to have special behaviour to 
elide
csums.  Even if appC is hacked to know how to twiddle some thing on it's veth 
port,
mucking with vethD will have no effect on vethB.

With regard to your example above, why would A corrupt packets?  My guess:

1)  It has bugs (so, fix the bugs, it could equally create incorrect data with 
proper checksums,
 so just enabling checksumming adds no useful protection.)


I agree with Ben here, what he needs is the ability for userspace to be
trusted when *forwarding* a packet. Ideally you'd only want to receive
the csum status per packet on the packet socket and pass the same value
on the vethA interface, with this status being kept when the packet
reaches vethB.

If A purposely corrupts packet, it's A's problem. It's similar to designing
a NIC which intentionally corrupts packets and reports "checksum good".

The real issue is that in order to do things right, the userspace bridge
(here, "A") would really need to pass this status. In Ben's case as he
says, bad checksum packets are dropped before reaching A, so that
simplifies the process quite a bit and that might be what causes some
confusion, but ideally we'd rather have recvmsg() and sendmsg() with
these flags.

I faced the exact same issue 3 years ago when playing with netmap, it was
slow as hell because it would lose all checksum information when packets
were passing through userland, resulting in GRO/GSO etc being disabled,
and had to modify it to let userland preserve it. That's especially
important when you have to deal with possibly corrupted packets not yet
detected in the chain because the NIC did not validate their checksums.

Willy




--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don???t modify ip_summed; doing so treats packets with bad checksums as good.

2016-05-13 Thread Ben Greear


Mr Miller:

How do you feel about a new socket-option to allow a socket to
request the old veth behaviour?

Thanks,
Ben

On 04/30/2016 10:30 PM, Willy Tarreau wrote:

On Sat, Apr 30, 2016 at 03:43:51PM -0700, Ben Greear wrote:

On 04/30/2016 03:01 PM, Vijay Pandurangan wrote:

Consider:

- App A  sends out corrupt packets 50% of the time and discards inbound data.

(...)

How can you make a generic app C know how to do this?  The path could be,
for instance:

eth0 <-> user-space-A <-> vethA <-> vethB <-> { kernel routing logic } <-> vethC <-> 
vethD <-> appC

There are no sockets on vethB, but it does need to have special behaviour to 
elide
csums.  Even if appC is hacked to know how to twiddle some thing on it's veth 
port,
mucking with vethD will have no effect on vethB.

With regard to your example above, why would A corrupt packets?  My guess:

1)  It has bugs (so, fix the bugs, it could equally create incorrect data with 
proper checksums,
 so just enabling checksumming adds no useful protection.)


I agree with Ben here, what he needs is the ability for userspace to be
trusted when *forwarding* a packet. Ideally you'd only want to receive
the csum status per packet on the packet socket and pass the same value
on the vethA interface, with this status being kept when the packet
reaches vethB.

If A purposely corrupts packet, it's A's problem. It's similar to designing
a NIC which intentionally corrupts packets and reports "checksum good".

The real issue is that in order to do things right, the userspace bridge
(here, "A") would really need to pass this status. In Ben's case as he
says, bad checksum packets are dropped before reaching A, so that
simplifies the process quite a bit and that might be what causes some
confusion, but ideally we'd rather have recvmsg() and sendmsg() with
these flags.

I faced the exact same issue 3 years ago when playing with netmap, it was
slow as hell because it would lose all checksum information when packets
were passing through userland, resulting in GRO/GSO etc being disabled,
and had to modify it to let userland preserve it. That's especially
important when you have to deal with possibly corrupted packets not yet
detected in the chain because the NIC did not validate their checksums.

Willy




--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.

2016-04-30 Thread Ben Greear




On 04/30/2016 03:01 PM, Vijay Pandurangan wrote:

On Sat, Apr 30, 2016 at 5:52 PM, Ben Greear <gree...@candelatech.com> wrote:


Good point, so if you had:

eth0 <-> raw <-> user space-bridge <-> raw <-> vethA <-> veth B <->
userspace-stub <->eth1

and user-space hub enabled this elide flag, things would work, right?
Then, it seems like what we need is a way to tell the kernel
router/bridge logic to follow elide signals in packets coming from
veth. I'm not sure what the best way to do this is because I'm less
familiar with conventions in that part of the kernel, but assuming
there's a way to do this, would it be acceptable?



You cannot receive on one veth without transmitting on the other, so
I think the elide csum logic can go on the raw-socket, and apply to packets
in the transmit-from-user-space direction.  Just allowing the socket to make
the veth behave like it used to before this patch in question should be good
enough, since that worked for us for years.  So, just an option to modify
the
ip_summed for pkts sent on a socket is probably sufficient.


I don't think this is right. Consider:

- App A  sends out corrupt packets 50% of the time and discards inbound data.
- App B doesn't care about corrupt packets and is happy to receive
them and has some way of dealing with them (special case)
- App C is a regular app, say nc or something.

In your world, where A decides what happens to data it transmits,
then
A<--veth-->B and A<---wire-->B will have the same behaviour

but

A<-- veth --> C and A<-- wire --> C will have _different_ behaviour: C
will behave incorrectly if it's connected over veth but correctly if
connected with a wire. That is a bug.

Since A cannot know what the app it's talking to will desire, I argue
that both sides of a message must be opted in to this optimization.


How can you make a generic app C know how to do this?  The path could be,
for instance:

eth0 <-> user-space-A <-> vethA <-> vethB <-> { kernel routing logic } <-> vethC <-> 
vethD <-> appC

There are no sockets on vethB, but it does need to have special behaviour to 
elide
csums.  Even if appC is hacked to know how to twiddle some thing on it's veth 
port,
mucking with vethD will have no effect on vethB.

With regard to your example above, why would A corrupt packets?  My guess:

1)  It has bugs (so, fix the bugs, it could equally create incorrect data with 
proper checksums,
so just enabling checksumming adds no useful protection.)

2)  It means to corrupt frames.  In that case, someone must expect that C 
should receive incorrect
frames, otherwise why bother making App-A corrupt them in the first place?

3)  You are explicitly trying to test the kernel checksum logic, so you want 
the kernel to
detect the bad checksum and throw away the packet.  In this case, just 
don't set the socket
option in appA to elide checksums and the packet will be thrown away.

Any other cases you can think of?

Thanks,
Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.

2016-04-30 Thread Ben Greear




On 04/30/2016 03:01 PM, Vijay Pandurangan wrote:

On Sat, Apr 30, 2016 at 5:52 PM, Ben Greear  wrote:


Good point, so if you had:

eth0 <-> raw <-> user space-bridge <-> raw <-> vethA <-> veth B <->
userspace-stub <->eth1

and user-space hub enabled this elide flag, things would work, right?
Then, it seems like what we need is a way to tell the kernel
router/bridge logic to follow elide signals in packets coming from
veth. I'm not sure what the best way to do this is because I'm less
familiar with conventions in that part of the kernel, but assuming
there's a way to do this, would it be acceptable?



You cannot receive on one veth without transmitting on the other, so
I think the elide csum logic can go on the raw-socket, and apply to packets
in the transmit-from-user-space direction.  Just allowing the socket to make
the veth behave like it used to before this patch in question should be good
enough, since that worked for us for years.  So, just an option to modify
the
ip_summed for pkts sent on a socket is probably sufficient.


I don't think this is right. Consider:

- App A  sends out corrupt packets 50% of the time and discards inbound data.
- App B doesn't care about corrupt packets and is happy to receive
them and has some way of dealing with them (special case)
- App C is a regular app, say nc or something.

In your world, where A decides what happens to data it transmits,
then
A<--veth-->B and A<---wire-->B will have the same behaviour

but

A<-- veth --> C and A<-- wire --> C will have _different_ behaviour: C
will behave incorrectly if it's connected over veth but correctly if
connected with a wire. That is a bug.

Since A cannot know what the app it's talking to will desire, I argue
that both sides of a message must be opted in to this optimization.


How can you make a generic app C know how to do this?  The path could be,
for instance:

eth0 <-> user-space-A <-> vethA <-> vethB <-> { kernel routing logic } <-> vethC <-> 
vethD <-> appC

There are no sockets on vethB, but it does need to have special behaviour to 
elide
csums.  Even if appC is hacked to know how to twiddle some thing on it's veth 
port,
mucking with vethD will have no effect on vethB.

With regard to your example above, why would A corrupt packets?  My guess:

1)  It has bugs (so, fix the bugs, it could equally create incorrect data with 
proper checksums,
so just enabling checksumming adds no useful protection.)

2)  It means to corrupt frames.  In that case, someone must expect that C 
should receive incorrect
frames, otherwise why bother making App-A corrupt them in the first place?

3)  You are explicitly trying to test the kernel checksum logic, so you want 
the kernel to
detect the bad checksum and throw away the packet.  In this case, just 
don't set the socket
option in appA to elide checksums and the packet will be thrown away.

Any other cases you can think of?

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.

2016-04-30 Thread Ben Greear




On 04/30/2016 02:36 PM, Vijay Pandurangan wrote:

On Sat, Apr 30, 2016 at 5:29 PM, Ben Greear <gree...@candelatech.com> wrote:



On 04/30/2016 02:13 PM, Vijay Pandurangan wrote:


On Sat, Apr 30, 2016 at 4:59 PM, Ben Greear <gree...@candelatech.com>
wrote:




On 04/30/2016 12:54 PM, Tom Herbert wrote:



We've put considerable effort into cleaning up the checksum interface
to make it as unambiguous as possible, please be very careful to
follow it. Broken checksum processing is really hard to detect and
debug.

CHECKSUM_UNNECESSARY means that some number of _specific_ checksums
(indicated by csum_level) have been verified to be correct in a
packet. Blindly promoting CHECKSUM_NONE to CHECKSUM_UNNECESSARY is
never right. If CHECKSUM_UNNECESSARY is set in such a manner but the
checksum it would refer to has not been verified and is incorrect this
is a major bug.




Suppose I know that the packet received on a packet-socket has
already been verified by a NIC that supports hardware checksumming.

Then, I want to transmit it on a veth interface using a second
packet socket.  I do not want veth to recalculate the checksum on
transmit, nor to validate it on the peer veth on receive, because I do
not want to waste the CPU cycles.  I am assuming that my app is not
accidentally corrupting frames, so the checksum can never be bad.

How should the checksumming be configured for the packets going into
the packet-socket from user-space?




It seems like that only the receiver should decide whether or not to
checksum packets on the veth, not the sender.

How about:

We could add a receiving socket option for "don't checksum packets
received from a veth when the other side has marked them as
elide-checksum-suggested" (similar to UDP_NOCHECKSUM), and a sending
socket option for "mark all data sent via this socket to a veth as
elide-checksum-suggested".

So the process would be:

Writer:
1. open read socket
2. open write socket, with option elide-checksum-for-veth-suggested
3. write data

Reader:
1. open read socket with "follow-elide-checksum-suggestions-on-veth"
2. read data

The kernel / module would then need to persist the flag on all packets
that traverse a veth, and drop these data when they leave the veth
module.



I'm not sure this works completely.  In my app, the packet flow might be:

eth0 <-> raw-socket <-> user-space-bridge <-> raw-socket <-> vethA <-> vethB
<-> [kernel router/bridge logic ...] <-> eth1


Good point, so if you had:

eth0 <-> raw <-> user space-bridge <-> raw <-> vethA <-> veth B <->
userspace-stub <->eth1

and user-space hub enabled this elide flag, things would work, right?
Then, it seems like what we need is a way to tell the kernel
router/bridge logic to follow elide signals in packets coming from
veth. I'm not sure what the best way to do this is because I'm less
familiar with conventions in that part of the kernel, but assuming
there's a way to do this, would it be acceptable?


You cannot receive on one veth without transmitting on the other, so
I think the elide csum logic can go on the raw-socket, and apply to packets
in the transmit-from-user-space direction.  Just allowing the socket to make
the veth behave like it used to before this patch in question should be good
enough, since that worked for us for years.  So, just an option to modify the
ip_summed for pkts sent on a socket is probably sufficient.


There may be no sockets on the vethB port.  And reader/writer is not
a good way to look at it since I am implementing a bi-directional bridge in
user-space and each packet-socket is for both rx and tx.


Sure, but we could model a bidrectional connection as two
unidirectional sockets for our discussions here, right?


Best not to I think, you want to make sure that one socket can
correctly handle tx and rx.  As long as that works, then using
uni-directional sockets should work too.

Thanks,
Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.

2016-04-30 Thread Ben Greear




On 04/30/2016 02:36 PM, Vijay Pandurangan wrote:

On Sat, Apr 30, 2016 at 5:29 PM, Ben Greear  wrote:



On 04/30/2016 02:13 PM, Vijay Pandurangan wrote:


On Sat, Apr 30, 2016 at 4:59 PM, Ben Greear 
wrote:




On 04/30/2016 12:54 PM, Tom Herbert wrote:



We've put considerable effort into cleaning up the checksum interface
to make it as unambiguous as possible, please be very careful to
follow it. Broken checksum processing is really hard to detect and
debug.

CHECKSUM_UNNECESSARY means that some number of _specific_ checksums
(indicated by csum_level) have been verified to be correct in a
packet. Blindly promoting CHECKSUM_NONE to CHECKSUM_UNNECESSARY is
never right. If CHECKSUM_UNNECESSARY is set in such a manner but the
checksum it would refer to has not been verified and is incorrect this
is a major bug.




Suppose I know that the packet received on a packet-socket has
already been verified by a NIC that supports hardware checksumming.

Then, I want to transmit it on a veth interface using a second
packet socket.  I do not want veth to recalculate the checksum on
transmit, nor to validate it on the peer veth on receive, because I do
not want to waste the CPU cycles.  I am assuming that my app is not
accidentally corrupting frames, so the checksum can never be bad.

How should the checksumming be configured for the packets going into
the packet-socket from user-space?




It seems like that only the receiver should decide whether or not to
checksum packets on the veth, not the sender.

How about:

We could add a receiving socket option for "don't checksum packets
received from a veth when the other side has marked them as
elide-checksum-suggested" (similar to UDP_NOCHECKSUM), and a sending
socket option for "mark all data sent via this socket to a veth as
elide-checksum-suggested".

So the process would be:

Writer:
1. open read socket
2. open write socket, with option elide-checksum-for-veth-suggested
3. write data

Reader:
1. open read socket with "follow-elide-checksum-suggestions-on-veth"
2. read data

The kernel / module would then need to persist the flag on all packets
that traverse a veth, and drop these data when they leave the veth
module.



I'm not sure this works completely.  In my app, the packet flow might be:

eth0 <-> raw-socket <-> user-space-bridge <-> raw-socket <-> vethA <-> vethB
<-> [kernel router/bridge logic ...] <-> eth1


Good point, so if you had:

eth0 <-> raw <-> user space-bridge <-> raw <-> vethA <-> veth B <->
userspace-stub <->eth1

and user-space hub enabled this elide flag, things would work, right?
Then, it seems like what we need is a way to tell the kernel
router/bridge logic to follow elide signals in packets coming from
veth. I'm not sure what the best way to do this is because I'm less
familiar with conventions in that part of the kernel, but assuming
there's a way to do this, would it be acceptable?


You cannot receive on one veth without transmitting on the other, so
I think the elide csum logic can go on the raw-socket, and apply to packets
in the transmit-from-user-space direction.  Just allowing the socket to make
the veth behave like it used to before this patch in question should be good
enough, since that worked for us for years.  So, just an option to modify the
ip_summed for pkts sent on a socket is probably sufficient.


There may be no sockets on the vethB port.  And reader/writer is not
a good way to look at it since I am implementing a bi-directional bridge in
user-space and each packet-socket is for both rx and tx.


Sure, but we could model a bidrectional connection as two
unidirectional sockets for our discussions here, right?


Best not to I think, you want to make sure that one socket can
correctly handle tx and rx.  As long as that works, then using
uni-directional sockets should work too.

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.

2016-04-30 Thread Ben Greear




On 04/30/2016 02:13 PM, Vijay Pandurangan wrote:

On Sat, Apr 30, 2016 at 4:59 PM, Ben Greear <gree...@candelatech.com> wrote:



On 04/30/2016 12:54 PM, Tom Herbert wrote:


We've put considerable effort into cleaning up the checksum interface
to make it as unambiguous as possible, please be very careful to
follow it. Broken checksum processing is really hard to detect and
debug.

CHECKSUM_UNNECESSARY means that some number of _specific_ checksums
(indicated by csum_level) have been verified to be correct in a
packet. Blindly promoting CHECKSUM_NONE to CHECKSUM_UNNECESSARY is
never right. If CHECKSUM_UNNECESSARY is set in such a manner but the
checksum it would refer to has not been verified and is incorrect this
is a major bug.



Suppose I know that the packet received on a packet-socket has
already been verified by a NIC that supports hardware checksumming.

Then, I want to transmit it on a veth interface using a second
packet socket.  I do not want veth to recalculate the checksum on
transmit, nor to validate it on the peer veth on receive, because I do
not want to waste the CPU cycles.  I am assuming that my app is not
accidentally corrupting frames, so the checksum can never be bad.

How should the checksumming be configured for the packets going into
the packet-socket from user-space?



It seems like that only the receiver should decide whether or not to
checksum packets on the veth, not the sender.

How about:

We could add a receiving socket option for "don't checksum packets
received from a veth when the other side has marked them as
elide-checksum-suggested" (similar to UDP_NOCHECKSUM), and a sending
socket option for "mark all data sent via this socket to a veth as
elide-checksum-suggested".

So the process would be:

Writer:
1. open read socket
2. open write socket, with option elide-checksum-for-veth-suggested
3. write data

Reader:
1. open read socket with "follow-elide-checksum-suggestions-on-veth"
2. read data

The kernel / module would then need to persist the flag on all packets
that traverse a veth, and drop these data when they leave the veth
module.


I'm not sure this works completely.  In my app, the packet flow might be:

eth0 <-> raw-socket <-> user-space-bridge <-> raw-socket <-> vethA <-> vethB <-> 
[kernel router/bridge logic ...] <-> eth1

There may be no sockets on the vethB port.  And reader/writer is not
a good way to look at it since I am implementing a bi-directional bridge in
user-space and each packet-socket is for both rx and tx.


Also, I might want to send raw frames that do have
broken checksums (lets assume a real NIC, not veth), and I want them
to hit the wire with those bad checksums.


How do I configure the checksumming in this case?



Correct me if I'm wrong but I think this is already possible now. You
can have packets with incorrect checksum hitting the wire as is. What
you cannot do is instruct the receiving end to ignore the checksum
from the sending end when using a physical device (and something I
think we should mimic on the sending device).


Yes, it does work currently (or, last I checked)...I just want to make sure it 
keeps working.

Thanks,
Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.

2016-04-30 Thread Ben Greear




On 04/30/2016 02:13 PM, Vijay Pandurangan wrote:

On Sat, Apr 30, 2016 at 4:59 PM, Ben Greear  wrote:



On 04/30/2016 12:54 PM, Tom Herbert wrote:


We've put considerable effort into cleaning up the checksum interface
to make it as unambiguous as possible, please be very careful to
follow it. Broken checksum processing is really hard to detect and
debug.

CHECKSUM_UNNECESSARY means that some number of _specific_ checksums
(indicated by csum_level) have been verified to be correct in a
packet. Blindly promoting CHECKSUM_NONE to CHECKSUM_UNNECESSARY is
never right. If CHECKSUM_UNNECESSARY is set in such a manner but the
checksum it would refer to has not been verified and is incorrect this
is a major bug.



Suppose I know that the packet received on a packet-socket has
already been verified by a NIC that supports hardware checksumming.

Then, I want to transmit it on a veth interface using a second
packet socket.  I do not want veth to recalculate the checksum on
transmit, nor to validate it on the peer veth on receive, because I do
not want to waste the CPU cycles.  I am assuming that my app is not
accidentally corrupting frames, so the checksum can never be bad.

How should the checksumming be configured for the packets going into
the packet-socket from user-space?



It seems like that only the receiver should decide whether or not to
checksum packets on the veth, not the sender.

How about:

We could add a receiving socket option for "don't checksum packets
received from a veth when the other side has marked them as
elide-checksum-suggested" (similar to UDP_NOCHECKSUM), and a sending
socket option for "mark all data sent via this socket to a veth as
elide-checksum-suggested".

So the process would be:

Writer:
1. open read socket
2. open write socket, with option elide-checksum-for-veth-suggested
3. write data

Reader:
1. open read socket with "follow-elide-checksum-suggestions-on-veth"
2. read data

The kernel / module would then need to persist the flag on all packets
that traverse a veth, and drop these data when they leave the veth
module.


I'm not sure this works completely.  In my app, the packet flow might be:

eth0 <-> raw-socket <-> user-space-bridge <-> raw-socket <-> vethA <-> vethB <-> 
[kernel router/bridge logic ...] <-> eth1

There may be no sockets on the vethB port.  And reader/writer is not
a good way to look at it since I am implementing a bi-directional bridge in
user-space and each packet-socket is for both rx and tx.


Also, I might want to send raw frames that do have
broken checksums (lets assume a real NIC, not veth), and I want them
to hit the wire with those bad checksums.


How do I configure the checksumming in this case?



Correct me if I'm wrong but I think this is already possible now. You
can have packets with incorrect checksum hitting the wire as is. What
you cannot do is instruct the receiving end to ignore the checksum
from the sending end when using a physical device (and something I
think we should mimic on the sending device).


Yes, it does work currently (or, last I checked)...I just want to make sure it 
keeps working.

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.

2016-04-30 Thread Ben Greear

On 04/30/2016 12:54 PM, Tom Herbert wrote:

We've put considerable effort into cleaning up the checksum interface
to make it as unambiguous as possible, please be very careful to
follow it. Broken checksum processing is really hard to detect and
debug.

CHECKSUM_UNNECESSARY means that some number of _specific_ checksums
(indicated by csum_level) have been verified to be correct in a
packet. Blindly promoting CHECKSUM_NONE to CHECKSUM_UNNECESSARY is
never right. If CHECKSUM_UNNECESSARY is set in such a manner but the
checksum it would refer to has not been verified and is incorrect this
is a major bug.

Suppose I know that the packet received on a packet-socket has
already been verified by a NIC that supports hardware checksumming.

Then, I want to transmit it on a veth interface using a second
packet socket. I do not want veth to recalculate the checksum on
transmit, nor to validate it on the peer veth on receive, because I do
not want to waste the CPU cycles. I am assuming that my app is not
accidentally corrupting frames, so the checksum can never be bad.

How should the checksumming be configured for the packets going into
the packet-socket from user-space?

Also, I might want to send raw frames that do have
broken checksums (lets assume a real NIC, not veth), and I want them
to hit the wire with those bad checksums.

How do I configure the checksumming in this case?

Thanks,
Ben

Tom

On Sat, Apr 30, 2016 at 12:40 PM, Ben Greear <gree...@candelatech.com> wrote:

On 04/30/2016 11:33 AM, Ben Hutchings wrote:

On Thu, 2016-04-28 at 12:29 +0200, Sabrina Dubroca wrote:

Hello,

http://dmz2.candelatech.com/?p=linux-4.4.dev.y/.git;a=commitdiff;h=8153e983c0e5eba1aafe1fc296248ed2a553f1ac;hp=454b07405d694dad52e7f41af5816eed0190da8a

Actually, no, this is not really a regression.

[...]

It really is. Even though the old behaviour was a bug (raw packets
should not be changed), if there are real applications that depend on
that then we have to keep those applications working somehow.

To be honest, I fail to see why the old behaviour is a bug when sending
raw packets from user-space. If raw packets should not be changed, then
we need some way to specify what the checksum setting is to begin with,
otherwise, user-space has not enough control.

A socket option for new programs, and sysctl configurable defaults for raw
sockets
for old binary programs would be sufficient I think.

Thanks,
Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.

2016-04-30 Thread Ben Greear

On 04/30/2016 12:54 PM, Tom Herbert wrote:

Suppose I know that the packet received on a packet-socket has
already been verified by a NIC that supports hardware checksumming.

How should the checksumming be configured for the packets going into
the packet-socket from user-space?

Also, I might want to send raw frames that do have
broken checksums (lets assume a real NIC, not veth), and I want them
to hit the wire with those bad checksums.

How do I configure the checksumming in this case?

Thanks,
Ben

Tom

On Sat, Apr 30, 2016 at 12:40 PM, Ben Greear wrote:

On 04/30/2016 11:33 AM, Ben Hutchings wrote:

On Thu, 2016-04-28 at 12:29 +0200, Sabrina Dubroca wrote:

Hello,

http://dmz2.candelatech.com/?p=linux-4.4.dev.y/.git;a=commitdiff;h=8153e983c0e5eba1aafe1fc296248ed2a553f1ac;hp=454b07405d694dad52e7f41af5816eed0190da8a

Actually, no, this is not really a regression.

[...]

It really is. Even though the old behaviour was a bug (raw packets
should not be changed), if there are real applications that depend on
that then we have to keep those applications working somehow.

A socket option for new programs, and sysctl configurable defaults for raw
sockets
for old binary programs would be sufficient I think.

Thanks,
Ben

--
Ben Greear
Candela Technologies Inc http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.

2016-04-30 Thread Ben Greear




On 04/30/2016 11:33 AM, Ben Hutchings wrote:

On Thu, 2016-04-28 at 12:29 +0200, Sabrina Dubroca wrote:

Hello,



http://dmz2.candelatech.com/?p=linux-4.4.dev.y/.git;a=commitdiff;h=8153e983c0e5eba1aafe1fc296248ed2a553f1ac;hp=454b07405d694dad52e7f41af5816eed0190da8a

Actually, no, this is not really a regression.

[...]

It really is.  Even though the old behaviour was a bug (raw packets
should not be changed), if there are real applications that depend on
that then we have to keep those applications working somehow.


To be honest, I fail to see why the old behaviour is a bug when sending
raw packets from user-space.  If raw packets should not be changed, then
we need some way to specify what the checksum setting is to begin with,
otherwise, user-space has not enough control.

A socket option for new programs, and sysctl configurable defaults for raw 
sockets
for old binary programs would be sufficient I think.

Thanks,
Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.

2016-04-30 Thread Ben Greear




On 04/30/2016 11:33 AM, Ben Hutchings wrote:

On Thu, 2016-04-28 at 12:29 +0200, Sabrina Dubroca wrote:

Hello,



http://dmz2.candelatech.com/?p=linux-4.4.dev.y/.git;a=commitdiff;h=8153e983c0e5eba1aafe1fc296248ed2a553f1ac;hp=454b07405d694dad52e7f41af5816eed0190da8a

Actually, no, this is not really a regression.

[...]

It really is.  Even though the old behaviour was a bug (raw packets
should not be changed), if there are real applications that depend on
that then we have to keep those applications working somehow.


To be honest, I fail to see why the old behaviour is a bug when sending
raw packets from user-space.  If raw packets should not be changed, then
we need some way to specify what the checksum setting is to begin with,
otherwise, user-space has not enough control.

A socket option for new programs, and sysctl configurable defaults for raw 
sockets
for old binary programs would be sufficient I think.

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.

2016-04-28 Thread Ben Greear

On 04/28/2016 03:29 AM, Sabrina Dubroca wrote:

Hello,

2016-04-27, 17:14:44 -0700, Ben Greear wrote:

On 04/27/2016 05:00 PM, Hannes Frederic Sowa wrote:

Hi Ben,

On Wed, Apr 27, 2016, at 20:07, Ben Hutchings wrote:

On Wed, 2016-04-27 at 08:59 -0700, Ben Greear wrote:

On 04/26/2016 04:02 PM, Ben Hutchings wrote:

3.2.80-rc1 review patch. If anyone has any objections, please let me know.

I would be careful about this. It causes regressions when sending
PACKET_SOCKET buffers from user-space to veth devices.

There was a proposed upstream fix for the regression, but it has not gone
into the tree as far as I know.

http://www.spinics.net/lists/netdev/msg370436.html

[...]

OK, I'll drop this for now.

The fall out from not having this patch is in my opinion a bigger
fallout than not having this patch. This patch fixes silent data
corruption vs. the problem Ben Greear is talking about, which might not
be that a common usage.

What do others think?

Bye,
Hannes

This patch from Cong Wang seems to fix the regression for me, I think it should
be added and
tested in the main tree, and then apply them to stable as a pair.

http://dmz2.candelatech.com/?p=linux-4.4.dev.y/.git;a=commitdiff;h=8153e983c0e5eba1aafe1fc296248ed2a553f1ac;hp=454b07405d694dad52e7f41af5816eed0190da8a

Actually, no, this is not really a regression.

If you capture packets on a device with checksum offloading enabled,
the TCP/UDP checksum isn't filled. veth also behaves that way. What
the "veth: don't modify ip_summed" patch does is enable proper
checksum validation on veth. This really was a bug in veth.

Cong's patch would also break cases where we choose to inject packets
with invalid checksums, and they would now be accepted as correct.

Your use case is invalid, it just happened to work because of a
bug. If you want the stack to fill checksums so that you want capture
and reinject packets, you have to disable checksum offloading (or
compute the checksum yourself in userspace).

Disabling checksum offloading or computing in user-space (and then
recomputing in veth to verify the checksum?) is a huge performance loss.

Maybe we could add a socket option to enable Cong's patch on a per-socket
basis? That way my use-case can still work and you can have this new
behaviour by default?

Thanks,
Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.

2016-04-28 Thread Ben Greear

On 04/28/2016 03:29 AM, Sabrina Dubroca wrote:

Hello,

2016-04-27, 17:14:44 -0700, Ben Greear wrote:

On 04/27/2016 05:00 PM, Hannes Frederic Sowa wrote:

Hi Ben,

On Wed, Apr 27, 2016, at 20:07, Ben Hutchings wrote:

On Wed, 2016-04-27 at 08:59 -0700, Ben Greear wrote:

On 04/26/2016 04:02 PM, Ben Hutchings wrote:

3.2.80-rc1 review patch. If anyone has any objections, please let me know.

I would be careful about this. It causes regressions when sending
PACKET_SOCKET buffers from user-space to veth devices.

There was a proposed upstream fix for the regression, but it has not gone
into the tree as far as I know.

http://www.spinics.net/lists/netdev/msg370436.html

[...]

OK, I'll drop this for now.

What do others think?

Bye,
Hannes

This patch from Cong Wang seems to fix the regression for me, I think it should
be added and
tested in the main tree, and then apply them to stable as a pair.

http://dmz2.candelatech.com/?p=linux-4.4.dev.y/.git;a=commitdiff;h=8153e983c0e5eba1aafe1fc296248ed2a553f1ac;hp=454b07405d694dad52e7f41af5816eed0190da8a

Actually, no, this is not really a regression.

Cong's patch would also break cases where we choose to inject packets
with invalid checksums, and they would now be accepted as correct.

Disabling checksum offloading or computing in user-space (and then
recomputing in veth to verify the checksum?) is a huge performance loss.

Maybe we could add a socket option to enable Cong's patch on a per-socket
basis? That way my use-case can still work and you can have this new
behaviour by default?

Thanks,
Ben

--
Ben Greear
Candela Technologies Inc http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.

2016-04-27 Thread Ben Greear


On 04/27/2016 05:00 PM, Hannes Frederic Sowa wrote:

Hi Ben,

On Wed, Apr 27, 2016, at 20:07, Ben Hutchings wrote:

On Wed, 2016-04-27 at 08:59 -0700, Ben Greear wrote:

On 04/26/2016 04:02 PM, Ben Hutchings wrote:


3.2.80-rc1 review patch.  If anyone has any objections, please let me know.

I would be careful about this.  It causes regressions when sending
PACKET_SOCKET buffers from user-space to veth devices.

There was a proposed upstream fix for the regression, but it has not gone
into the tree as far as I know.

http://www.spinics.net/lists/netdev/msg370436.html

[...]

OK, I'll drop this for now.


The fall out from not having this patch is in my opinion a bigger
fallout than not having this patch. This patch fixes silent data
corruption vs. the problem Ben Greear is talking about, which might not
be that a common usage.

What do others think?

Bye,
Hannes



This patch from Cong Wang seems to fix the regression for me, I think it should 
be added and
tested in the main tree, and then apply them to stable as a pair.

http://dmz2.candelatech.com/?p=linux-4.4.dev.y/.git;a=commitdiff;h=8153e983c0e5eba1aafe1fc296248ed2a553f1ac;hp=454b07405d694dad52e7f41af5816eed0190da8a



diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index da1ae0e..f8cc758 100644 (file)
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1926,6 +1926,7 @@ retry:
goto out_unlock;
}

+   skb->ip_summed = CHECKSUM_UNNECESSARY;
skb->protocol = proto;
skb->dev = dev;
skb->priority = sk->sk_priority;
@@ -2352,6 +2353,7 @@ static int tpacket_fill_skb(struct packet_sock *po, 
struct sk_buff *skb,

ph.raw = frame;

+   skb->ip_summed = CHECKSUM_UNNECESSARY;
skb->protocol = proto;
skb->dev = dev;
skb->priority = po->sk.sk_priority;
@@ -2776,6 +2778,7 @@ static int packet_snd(struct socket *sock, struct msghdr 
*msg, size_t len)
goto out_free;
}

+   skb->ip_summed = CHECKSUM_UNNECESSARY;
skb->protocol = proto;
skb->dev = dev;
skb->priority = sk->sk_priority;

Thanks,
Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.

2016-04-27 Thread Ben Greear


On 04/27/2016 05:00 PM, Hannes Frederic Sowa wrote:

Hi Ben,

On Wed, Apr 27, 2016, at 20:07, Ben Hutchings wrote:

On Wed, 2016-04-27 at 08:59 -0700, Ben Greear wrote:

On 04/26/2016 04:02 PM, Ben Hutchings wrote:


3.2.80-rc1 review patch.  If anyone has any objections, please let me know.

I would be careful about this.  It causes regressions when sending
PACKET_SOCKET buffers from user-space to veth devices.

There was a proposed upstream fix for the regression, but it has not gone
into the tree as far as I know.

http://www.spinics.net/lists/netdev/msg370436.html

[...]

OK, I'll drop this for now.


The fall out from not having this patch is in my opinion a bigger
fallout than not having this patch. This patch fixes silent data
corruption vs. the problem Ben Greear is talking about, which might not
be that a common usage.

What do others think?

Bye,
Hannes



This patch from Cong Wang seems to fix the regression for me, I think it should 
be added and
tested in the main tree, and then apply them to stable as a pair.

http://dmz2.candelatech.com/?p=linux-4.4.dev.y/.git;a=commitdiff;h=8153e983c0e5eba1aafe1fc296248ed2a553f1ac;hp=454b07405d694dad52e7f41af5816eed0190da8a



diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index da1ae0e..f8cc758 100644 (file)
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1926,6 +1926,7 @@ retry:
goto out_unlock;
}

+   skb->ip_summed = CHECKSUM_UNNECESSARY;
skb->protocol = proto;
skb->dev = dev;
skb->priority = sk->sk_priority;
@@ -2352,6 +2353,7 @@ static int tpacket_fill_skb(struct packet_sock *po, 
struct sk_buff *skb,

ph.raw = frame;

+   skb->ip_summed = CHECKSUM_UNNECESSARY;
skb->protocol = proto;
skb->dev = dev;
skb->priority = po->sk.sk_priority;
@@ -2776,6 +2778,7 @@ static int packet_snd(struct socket *sock, struct msghdr 
*msg, size_t len)
goto out_free;
}

+   skb->ip_summed = CHECKSUM_UNNECESSARY;
skb->protocol = proto;
skb->dev = dev;
skb->priority = sk->sk_priority;

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.

2016-04-27 Thread Ben Greear


On 04/26/2016 04:02 PM, Ben Hutchings wrote:

3.2.80-rc1 review patch.  If anyone has any objections, please let me know.


I would be careful about this.  It causes regressions when sending
PACKET_SOCKET buffers from user-space to veth devices.

There was a proposed upstream fix for the regression, but it has not gone
into the tree as far as I know.

http://www.spinics.net/lists/netdev/msg370436.html

Thanks,
Ben



--

From: Vijay Pandurangan <vij...@vijayp.ca>

[ Upstream commit ce8c839b74e3017996fad4e1b7ba2e2625ede82f ]

Packets that arrive from real hardware devices have ip_summed ==
CHECKSUM_UNNECESSARY if the hardware verified the checksums, or
CHECKSUM_NONE if the packet is bad or it was unable to verify it. The
current version of veth will replace CHECKSUM_NONE with
CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to
a veth device to be delivered to the application. This caused applications
at Twitter to receive corrupt data when network hardware was corrupting
packets.

We believe this was added as an optimization to skip computing and
verifying checksums for communication between containers. However, locally
generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as
written does nothing for them. As far as we can tell, after removing this
code, these packets are transmitted from one stack to another unmodified
(tcpdump shows invalid checksums on both sides, as expected), and they are
delivered correctly to applications. We didn’t test every possible network
configuration, but we tried a few common ones such as bridging containers,
using NAT between the host and a container, and routing from hardware
devices to containers. We have effectively deployed this in production at
Twitter (by disabling RX checksum offloading on veth devices).

This code dates back to the first version of the driver, commit
 ("[NET]: Virtual ethernet device driver"), so I
suspect this bug occurred mostly because the driver API has evolved
significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix
packet checksumming") (in December 2010) fixed this for packets that get
created locally and sent to hardware devices, by not changing
CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming
in from hardware devices.

Co-authored-by: Evan Jones <e...@evanjones.ca>
Signed-off-by: Evan Jones <e...@evanjones.ca>
Cc: Nicolas Dichtel <nicolas.dich...@6wind.com>
Cc: Phil Sutter <p...@nwl.cc>
Cc: Toshiaki Makita <makita.toshi...@lab.ntt.co.jp>
Cc: net...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vijay Pandurangan <vij...@vijayp.ca>
Acked-by: Cong Wang <cw...@twopensource.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <b...@decadent.org.uk>
---
  drivers/net/veth.c | 6 --
  1 file changed, 6 deletions(-)

--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -126,11 +126,6 @@ static netdev_tx_t veth_xmit(struct sk_b
stats = this_cpu_ptr(priv->stats);
rcv_stats = this_cpu_ptr(rcv_priv->stats);

-   /* don't change ip_summed == CHECKSUM_PARTIAL, as that
-  will cause bad checksum on forwarded packets */
-   if (skb->ip_summed == CHECKSUM_NONE &&
-   rcv->features & NETIF_F_RXCSUM)
-   skb->ip_summed = CHECKSUM_UNNECESSARY;

length = skb->len;
if (dev_forward_skb(rcv, skb) != NET_RX_SUCCESS)




--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.

2016-04-27 Thread Ben Greear


On 04/26/2016 04:02 PM, Ben Hutchings wrote:

3.2.80-rc1 review patch.  If anyone has any objections, please let me know.


I would be careful about this.  It causes regressions when sending
PACKET_SOCKET buffers from user-space to veth devices.

There was a proposed upstream fix for the regression, but it has not gone
into the tree as far as I know.

http://www.spinics.net/lists/netdev/msg370436.html

Thanks,
Ben



--

From: Vijay Pandurangan 

[ Upstream commit ce8c839b74e3017996fad4e1b7ba2e2625ede82f ]

Packets that arrive from real hardware devices have ip_summed ==
CHECKSUM_UNNECESSARY if the hardware verified the checksums, or
CHECKSUM_NONE if the packet is bad or it was unable to verify it. The
current version of veth will replace CHECKSUM_NONE with
CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to
a veth device to be delivered to the application. This caused applications
at Twitter to receive corrupt data when network hardware was corrupting
packets.

We believe this was added as an optimization to skip computing and
verifying checksums for communication between containers. However, locally
generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as
written does nothing for them. As far as we can tell, after removing this
code, these packets are transmitted from one stack to another unmodified
(tcpdump shows invalid checksums on both sides, as expected), and they are
delivered correctly to applications. We didn’t test every possible network
configuration, but we tried a few common ones such as bridging containers,
using NAT between the host and a container, and routing from hardware
devices to containers. We have effectively deployed this in production at
Twitter (by disabling RX checksum offloading on veth devices).

This code dates back to the first version of the driver, commit
 ("[NET]: Virtual ethernet device driver"), so I
suspect this bug occurred mostly because the driver API has evolved
significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix
packet checksumming") (in December 2010) fixed this for packets that get
created locally and sent to hardware devices, by not changing
CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming
in from hardware devices.

Co-authored-by: Evan Jones 
Signed-off-by: Evan Jones 
Cc: Nicolas Dichtel 
Cc: Phil Sutter 
Cc: Toshiaki Makita 
Cc: net...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vijay Pandurangan 
Acked-by: Cong Wang 
Signed-off-by: David S. Miller 
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings 
---
  drivers/net/veth.c | 6 --
  1 file changed, 6 deletions(-)

--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -126,11 +126,6 @@ static netdev_tx_t veth_xmit(struct sk_b
stats = this_cpu_ptr(priv->stats);
rcv_stats = this_cpu_ptr(rcv_priv->stats);

-   /* don't change ip_summed == CHECKSUM_PARTIAL, as that
-  will cause bad checksum on forwarded packets */
-   if (skb->ip_summed == CHECKSUM_NONE &&
-   rcv->features & NETIF_F_RXCSUM)
-   skb->ip_summed = CHECKSUM_UNNECESSARY;

length = skb->len;
if (dev_forward_skb(rcv, skb) != NET_RX_SUCCESS)




--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: TCP reaching to maximum throughput after a long time

2016-04-12 Thread Ben Greear


On 04/12/2016 01:29 PM, Eric Dumazet wrote:

On Tue, 2016-04-12 at 13:23 -0700, Ben Greear wrote:


It worked well enough for years that I didn't even know other algorithms were
available.  It was broken around 4.0 time, and I reported it to the list,
and no one seemed to really care enough to do anything about it.  I changed
to reno and ignored the problem as well.

It is trivially easy to see the regression when using ath10k NIC, and from this 
email
thread, I guess other NICs have similar issues.


Since it is so trivial, why don't you start a bisection ?


I vaguely remember doing a bisect, but I can't find any email about
that, so maybe I didn't.  At any rate, it is somewhere between 3.17 and 4.0.
From memory, it was between 3.19 and 4.0, but I am not certain of that.

Neil's suggestion, from the thread below, is that it was likely:  "605ad7f tcp: 
refine TSO autosizing"

Here is previous email thread:

https://www.mail-archive.com/netdev@vger.kernel.org/msg80803.html

This one has a link to a pcap I made at the time:

https://www.mail-archive.com/netdev@vger.kernel.org/msg80890.html



I asked a capture, I did not say ' switch to Reno or whatever ', right ?

Guessing is nice, but investigating and fixing is better.

Do not assume that nothing can be done, please ?


Thanks,
Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: TCP reaching to maximum throughput after a long time

2016-04-12 Thread Ben Greear


On 04/12/2016 01:29 PM, Eric Dumazet wrote:

On Tue, 2016-04-12 at 13:23 -0700, Ben Greear wrote:


It worked well enough for years that I didn't even know other algorithms were
available.  It was broken around 4.0 time, and I reported it to the list,
and no one seemed to really care enough to do anything about it.  I changed
to reno and ignored the problem as well.

It is trivially easy to see the regression when using ath10k NIC, and from this 
email
thread, I guess other NICs have similar issues.


Since it is so trivial, why don't you start a bisection ?


I vaguely remember doing a bisect, but I can't find any email about
that, so maybe I didn't.  At any rate, it is somewhere between 3.17 and 4.0.
From memory, it was between 3.19 and 4.0, but I am not certain of that.

Neil's suggestion, from the thread below, is that it was likely:  "605ad7f tcp: 
refine TSO autosizing"

Here is previous email thread:

https://www.mail-archive.com/netdev@vger.kernel.org/msg80803.html

This one has a link to a pcap I made at the time:

https://www.mail-archive.com/netdev@vger.kernel.org/msg80890.html



I asked a capture, I did not say ' switch to Reno or whatever ', right ?

Guessing is nice, but investigating and fixing is better.

Do not assume that nothing can be done, please ?


Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: TCP reaching to maximum throughput after a long time

2016-04-12 Thread Ben Greear


On 04/12/2016 01:17 PM, Eric Dumazet wrote:

On Tue, 2016-04-12 at 13:11 -0700, Ben Greear wrote:

On 04/12/2016 12:31 PM, Machani, Yaniv wrote:

On Tue, Apr 12, 2016 at 18:04:52, Ben Greear wrote:

On 04/12/2016 07:52 AM, Eric Dumazet wrote:

On Tue, 2016-04-12 at 12:17 +, Machani, Yaniv wrote:




If you are using 'Cubic' TCP congestion control, then please try
something different.
It was broken last I checked, at least when used with the ath10k driver.



Thanks Ben, this indeed seems to be the issue !
Switching to reno got me to max throughput instantly.

I'm still looking through the thread you have shared, but from what I 
understand there is no planned fix for it ?


I think at the time it was blamed on ath10k and no one cared to try to fix it.

Or, maybe no one really uses CUBIC anymore?

Either way, I have no plans to try to fix CUBIC, but maybe someone who knows
this code better could give it a try.


Well, cubic seems to work in many cases, assuming they are not too many
drops.

Assuming one flow can get nominal speed in few RTT is kind a dream, and
so far nobody claimed a CC was able to do that, while still being fair
and resilient.

TCP CC are full of heuristics, and by definition heuristics that were
working 6 years ago might need to be refreshed.

We are still maintaining Cubic for sure.


It worked well enough for years that I didn't even know other algorithms were
available.  It was broken around 4.0 time, and I reported it to the list,
and no one seemed to really care enough to do anything about it.  I changed
to reno and ignored the problem as well.

It is trivially easy to see the regression when using ath10k NIC, and from this 
email
thread, I guess other NICs have similar issues.

Thanks,
Ben


--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: TCP reaching to maximum throughput after a long time

2016-04-12 Thread Ben Greear


On 04/12/2016 01:17 PM, Eric Dumazet wrote:

On Tue, 2016-04-12 at 13:11 -0700, Ben Greear wrote:

On 04/12/2016 12:31 PM, Machani, Yaniv wrote:

On Tue, Apr 12, 2016 at 18:04:52, Ben Greear wrote:

On 04/12/2016 07:52 AM, Eric Dumazet wrote:

On Tue, 2016-04-12 at 12:17 +, Machani, Yaniv wrote:




If you are using 'Cubic' TCP congestion control, then please try
something different.
It was broken last I checked, at least when used with the ath10k driver.



Thanks Ben, this indeed seems to be the issue !
Switching to reno got me to max throughput instantly.

I'm still looking through the thread you have shared, but from what I 
understand there is no planned fix for it ?


I think at the time it was blamed on ath10k and no one cared to try to fix it.

Or, maybe no one really uses CUBIC anymore?

Either way, I have no plans to try to fix CUBIC, but maybe someone who knows
this code better could give it a try.


Well, cubic seems to work in many cases, assuming they are not too many
drops.

Assuming one flow can get nominal speed in few RTT is kind a dream, and
so far nobody claimed a CC was able to do that, while still being fair
and resilient.

TCP CC are full of heuristics, and by definition heuristics that were
working 6 years ago might need to be refreshed.

We are still maintaining Cubic for sure.


It worked well enough for years that I didn't even know other algorithms were
available.  It was broken around 4.0 time, and I reported it to the list,
and no one seemed to really care enough to do anything about it.  I changed
to reno and ignored the problem as well.

It is trivially easy to see the regression when using ath10k NIC, and from this 
email
thread, I guess other NICs have similar issues.

Thanks,
Ben


--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: TCP reaching to maximum throughput after a long time

2016-04-12 Thread Ben Greear


On 04/12/2016 12:31 PM, Machani, Yaniv wrote:

On Tue, Apr 12, 2016 at 18:04:52, Ben Greear wrote:

On 04/12/2016 07:52 AM, Eric Dumazet wrote:

On Tue, 2016-04-12 at 12:17 +, Machani, Yaniv wrote:




If you are using 'Cubic' TCP congestion control, then please try
something different.
It was broken last I checked, at least when used with the ath10k driver.



Thanks Ben, this indeed seems to be the issue !
Switching to reno got me to max throughput instantly.

I'm still looking through the thread you have shared, but from what I 
understand there is no planned fix for it ?


I think at the time it was blamed on ath10k and no one cared to try to fix it.

Or, maybe no one really uses CUBIC anymore?

Either way, I have no plans to try to fix CUBIC, but maybe someone who knows
this code better could give it a try.

Thanks,
Ben


--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: TCP reaching to maximum throughput after a long time

2016-04-12 Thread Ben Greear


On 04/12/2016 12:31 PM, Machani, Yaniv wrote:

On Tue, Apr 12, 2016 at 18:04:52, Ben Greear wrote:

On 04/12/2016 07:52 AM, Eric Dumazet wrote:

On Tue, 2016-04-12 at 12:17 +, Machani, Yaniv wrote:




If you are using 'Cubic' TCP congestion control, then please try
something different.
It was broken last I checked, at least when used with the ath10k driver.



Thanks Ben, this indeed seems to be the issue !
Switching to reno got me to max throughput instantly.

I'm still looking through the thread you have shared, but from what I 
understand there is no planned fix for it ?


I think at the time it was blamed on ath10k and no one cared to try to fix it.

Or, maybe no one really uses CUBIC anymore?

Either way, I have no plans to try to fix CUBIC, but maybe someone who knows
this code better could give it a try.

Thanks,
Ben


--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: TCP reaching to maximum throughput after a long time

2016-04-12 Thread Ben Greear


On 04/12/2016 07:52 AM, Eric Dumazet wrote:

On Tue, 2016-04-12 at 12:17 +, Machani, Yaniv wrote:

Hi,
After updating from Kernel 3.14 to Kernel 4.4 we have seen a TCP performance 
degradation over Wi-Fi.
In 3.14 kernel, TCP got to its max throughout after less than a second, while 
in the 4.4  it is taking ~20-30 seconds.
UDP TX/RX and TCP RX performance is as expected.
We are using a Beagle Bone Black and a WiLink8 device.

Were there any related changes that might cause such behavior ?
Kernel configuration and sysctl values were compared, but no significant 
differences have been found.


If you are using 'Cubic' TCP congestion control, then please try something 
different.
It was broken last I checked, at least when used with the ath10k driver.

https://marc.info/?l=linux-netdev=144405216005715=2

Thanks,
Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: TCP reaching to maximum throughput after a long time

2016-04-12 Thread Ben Greear


On 04/12/2016 07:52 AM, Eric Dumazet wrote:

On Tue, 2016-04-12 at 12:17 +, Machani, Yaniv wrote:

Hi,
After updating from Kernel 3.14 to Kernel 4.4 we have seen a TCP performance 
degradation over Wi-Fi.
In 3.14 kernel, TCP got to its max throughout after less than a second, while 
in the 4.4  it is taking ~20-30 seconds.
UDP TX/RX and TCP RX performance is as expected.
We are using a Beagle Bone Black and a WiLink8 device.

Were there any related changes that might cause such behavior ?
Kernel configuration and sysctl values were compared, but no significant 
differences have been found.


If you are using 'Cubic' TCP congestion control, then please try something 
different.
It was broken last I checked, at least when used with the ath10k driver.

https://marc.info/?l=linux-netdev=144405216005715=2

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Deadlock related to file permissions and/or cgroup, 4.4.6+

2016-04-07 Thread Ben Greear

W  O4.4.6+ #23
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
llvmpipe-2  D 880068d2fbc0 0  1356  1 0x
 880068d2fbc0 00ff880069b5d3c8 88006bc9ff98 88006bc9ff80
 880069b5cc08 880066bfcc00 880069b5cc00 880068d28000
 880069b5cc00 8800682e7c60 880069b5ed50 8375c620
Call Trace:
 [] schedule+0xc3/0xe0
 [] rwsem_down_read_failed+0x190/0x1f9
 [] ? rt_mutex_futex_unlock+0x28/0x28
 [] ? tlb_flush_mmu_free+0x24/0x6a
 [] ? match_held_lock+0x30/0x103
 [] call_rwsem_down_read_failed+0x14/0x30
 [] ? call_rwsem_down_read_failed+0x14/0x30
 [] ? exit_signals+0x80/0x1ef
 [] ? percpu_down_read+0x60/0x72
 [] exit_signals+0x80/0x1ef
 [] ? get_signal+0x8a2/0x8a2
 [] ? __blocking_notifier_call_chain+0x2e/0x73
 [] do_exit+0x224/0x1157
 [] ? swapin_walk_pmd_entry+0x1c3/0x1c3
 [] ? release_task+0x738/0x738
 [] ? SyS_futex+0x152/0x1ee
 [] ? do_futex+0xb4d/0xb4d
 [] ? mark_held_locks+0x2d/0x90
 [] ? lockdep_sys_exit+0x1a/0x91
 [] ? lockdep_sys_exit_thunk+0x12/0x14
 [] SyS_exit+0x1d/0x1d
 [] entry_SYSCALL_64_fastpath+0x16/0x7a
1 lock held by llvmpipe-2/1356:
 #0:  (_threadgroup_rwsem){++}, at: [] 
exit_signals+0x80/0x1ef


Thanks,
Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Deadlock related to file permissions and/or cgroup, 4.4.6+

2016-04-07 Thread Ben Greear

W  O4.4.6+ #23
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
llvmpipe-2  D 880068d2fbc0 0  1356  1 0x
 880068d2fbc0 00ff880069b5d3c8 88006bc9ff98 88006bc9ff80
 880069b5cc08 880066bfcc00 880069b5cc00 880068d28000
 880069b5cc00 8800682e7c60 880069b5ed50 8375c620
Call Trace:
 [] schedule+0xc3/0xe0
 [] rwsem_down_read_failed+0x190/0x1f9
 [] ? rt_mutex_futex_unlock+0x28/0x28
 [] ? tlb_flush_mmu_free+0x24/0x6a
 [] ? match_held_lock+0x30/0x103
 [] call_rwsem_down_read_failed+0x14/0x30
 [] ? call_rwsem_down_read_failed+0x14/0x30
 [] ? exit_signals+0x80/0x1ef
 [] ? percpu_down_read+0x60/0x72
 [] exit_signals+0x80/0x1ef
 [] ? get_signal+0x8a2/0x8a2
 [] ? __blocking_notifier_call_chain+0x2e/0x73
 [] do_exit+0x224/0x1157
 [] ? swapin_walk_pmd_entry+0x1c3/0x1c3
 [] ? release_task+0x738/0x738
 [] ? SyS_futex+0x152/0x1ee
 [] ? do_futex+0xb4d/0xb4d
 [] ? mark_held_locks+0x2d/0x90
 [] ? lockdep_sys_exit+0x1a/0x91
 [] ? lockdep_sys_exit_thunk+0x12/0x14
 [] SyS_exit+0x1d/0x1d
 [] entry_SYSCALL_64_fastpath+0x16/0x7a
1 lock held by llvmpipe-2/1356:
 #0:  (_threadgroup_rwsem){++}, at: [] 
exit_signals+0x80/0x1ef


Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: Question on rhashtable in worst-case scenario.

2016-04-01 Thread Ben Greear


On 03/31/2016 05:46 PM, Herbert Xu wrote:

On Thu, Mar 31, 2016 at 05:29:59PM +0200, Johannes Berg wrote:


Does removing this completely disable the "-EEXIST" error? I can't say
I fully understand the elasticity stuff in __rhashtable_insert_fast().


What EEXIST error are you talking about? The only one that can be
returned on insertion is if you're explicitly checking for dups
which clearly can't be the case for you.

If you're talking about the EEXIST error due to a rehash then it is
completely hidden from you by rhashtable_insert_rehash.

If you actually meant EBUSY then yes this should prevent it from
occurring, unless your chain-length exceeds 2^32.


EEXIST was on removal, and was a symptom of the failure to insert, not
really a problem itself.

I reverted my revert (ie, back to rhashtable), added Johanne's patch
to check insertion (and added my on pr_err there).

I also added this:

diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index 38ef0be..c25b945 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -66,6 +66,7 @@

 static const struct rhashtable_params sta_rht_params = {
.nelem_hint = 3, /* start small */
+   .insecure_elasticity = true, /* Disable chain-length checks. */
.automatic_shrinking = true,
.head_offset = offsetof(struct sta_info, hash_node),
.key_offset = offsetof(struct sta_info, addr),


Now, my test case seems to pass, though I did have one strange issue
before I put in the pr_err.  I'm not sure if it was a hashtable issue
or something else..but I have lots of something-else going on in this system,
so I'd say that likely the patch above fixes rhashtable for my use case.

I will of course let you know if I run into more issues that appear
to be hashtable related!

Thanks,
Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: Question on rhashtable in worst-case scenario.

2016-04-01 Thread Ben Greear


On 03/31/2016 05:46 PM, Herbert Xu wrote:

On Thu, Mar 31, 2016 at 05:29:59PM +0200, Johannes Berg wrote:


Does removing this completely disable the "-EEXIST" error? I can't say
I fully understand the elasticity stuff in __rhashtable_insert_fast().


What EEXIST error are you talking about? The only one that can be
returned on insertion is if you're explicitly checking for dups
which clearly can't be the case for you.

If you're talking about the EEXIST error due to a rehash then it is
completely hidden from you by rhashtable_insert_rehash.

If you actually meant EBUSY then yes this should prevent it from
occurring, unless your chain-length exceeds 2^32.


EEXIST was on removal, and was a symptom of the failure to insert, not
really a problem itself.

I reverted my revert (ie, back to rhashtable), added Johanne's patch
to check insertion (and added my on pr_err there).

I also added this:

diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index 38ef0be..c25b945 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -66,6 +66,7 @@

 static const struct rhashtable_params sta_rht_params = {
.nelem_hint = 3, /* start small */
+   .insecure_elasticity = true, /* Disable chain-length checks. */
.automatic_shrinking = true,
.head_offset = offsetof(struct sta_info, hash_node),
.key_offset = offsetof(struct sta_info, addr),


Now, my test case seems to pass, though I did have one strange issue
before I put in the pr_err.  I'm not sure if it was a hashtable issue
or something else..but I have lots of something-else going on in this system,
so I'd say that likely the patch above fixes rhashtable for my use case.

I will of course let you know if I run into more issues that appear
to be hashtable related!

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: Question on rhashtable in worst-case scenario.

2016-03-31 Thread Ben Greear




On 03/31/2016 12:46 AM, Johannes Berg wrote:

On Wed, 2016-03-30 at 09:52 -0700, Ben Greear wrote:


If someone can fix rhashtable, then great.
I read some earlier comments [1] back when someone else reported
similar problems, and the comments seemed to indicate that rhashtable
was broken in this manner on purpose to protect against hashing
attacks.

If you are baking in this type of policy to what should be a basic
data-type, then it is not useful for how it is being used in
the mac80211 stack.

[1]  http://lkml.iu.edu/hypermail/linux/kernel/1512.2/01681.html



That's not really saying it's purposely broken, that's more saying that
Herbert didn't see a point in fixing a case that has awful behaviour
already.

However, I'm confused now - we can much more easily live with
*insertion* failures, as the linked email indicates, than *deletion*
failures, which I think you had indicated originally. Are you really
seeing *deletion* failures?

If there are in fact *deletion* failures then I think we really need to
address those in rhashtable, no matter the worst-case behaviour of the
hashing or keys, since we should be able to delete entries in order to
get back to something reasonable. Looking at the code though, I don't
actually see that happening.

If you're seeing only *insertion* failures, which you indicated in the
root of this thread, then I think for the general case in mac80211 we
can live with that - we use a seeded jhash for the hash function, and
since in the general case we cannot accept entries with identical MAC
addresses to start with, it shouldn't be possible to run into this
problem under normal use cases.


I see insertion failure, and then later, if of course fails to delete
as well since it was never inserted to begin with.  There is no good
way to deal with insertion error, so just need to fix the hashtable.



In this case, I think perhaps you can just patch your local system with
the many interfaces connecting to the same AP to add the parameter
Herbert suggested (.insecure_elasticity = true in sta_rht_params). This
is, after all, very much a case that "normal" operation doesn't even
get close to.


Old code, even stock kernels, could deal with this properly, so I think it
should be fixed by default.  I'll put rhash back in my tree and try that 
insecure
option and see if it works.

Thanks,
Ben


johannes



--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: Question on rhashtable in worst-case scenario.

2016-03-31 Thread Ben Greear




On 03/31/2016 12:46 AM, Johannes Berg wrote:

On Wed, 2016-03-30 at 09:52 -0700, Ben Greear wrote:


If someone can fix rhashtable, then great.
I read some earlier comments [1] back when someone else reported
similar problems, and the comments seemed to indicate that rhashtable
was broken in this manner on purpose to protect against hashing
attacks.

If you are baking in this type of policy to what should be a basic
data-type, then it is not useful for how it is being used in
the mac80211 stack.

[1]  http://lkml.iu.edu/hypermail/linux/kernel/1512.2/01681.html



That's not really saying it's purposely broken, that's more saying that
Herbert didn't see a point in fixing a case that has awful behaviour
already.

However, I'm confused now - we can much more easily live with
*insertion* failures, as the linked email indicates, than *deletion*
failures, which I think you had indicated originally. Are you really
seeing *deletion* failures?

If there are in fact *deletion* failures then I think we really need to
address those in rhashtable, no matter the worst-case behaviour of the
hashing or keys, since we should be able to delete entries in order to
get back to something reasonable. Looking at the code though, I don't
actually see that happening.

If you're seeing only *insertion* failures, which you indicated in the
root of this thread, then I think for the general case in mac80211 we
can live with that - we use a seeded jhash for the hash function, and
since in the general case we cannot accept entries with identical MAC
addresses to start with, it shouldn't be possible to run into this
problem under normal use cases.


I see insertion failure, and then later, if of course fails to delete
as well since it was never inserted to begin with.  There is no good
way to deal with insertion error, so just need to fix the hashtable.



In this case, I think perhaps you can just patch your local system with
the many interfaces connecting to the same AP to add the parameter
Herbert suggested (.insecure_elasticity = true in sta_rht_params). This
is, after all, very much a case that "normal" operation doesn't even
get close to.


Old code, even stock kernels, could deal with this properly, so I think it
should be fixed by default.  I'll put rhash back in my tree and try that 
insecure
option and see if it works.

Thanks,
Ben


johannes



--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: Question on rhashtable in worst-case scenario.

2016-03-30 Thread Ben Greear

On 03/30/2016 09:38 AM, David Miller wrote:

From: Johannes Berg <johan...@sipsolutions.net>
Date: Wed, 30 Mar 2016 11:14:12 +0200

On Tue, 2016-03-29 at 09:16 -0700, Ben Greear wrote:

Looks like rhashtable has too much policy in it to properly deal with
cases where there are too many hash collisions, so I am going to work
on reverting it's use in mac80211.

I'm not really all that happy with that approach - can't we fix the
rhashtable? It's a pretty rare corner case that many keys really are
identical and no kind of hash algorithm, but it seems much better to
still deal with it than to remove the rhashtable usage and go back to
hand-rolling something.

Yeah reverting seems like a really idiotic way to deal with the issue.

If someone can fix rhashtable, then great.
I read some earlier comments [1] back when someone else reported
similar problems, and the comments seemed to indicate that rhashtable was
broken in this manner on purpose to protect against hashing attacks.

If you are baking in this type of policy to what should be a basic
data-type, then it is not useful for how it is being used in
the mac80211 stack.

[1]  http://lkml.iu.edu/hypermail/linux/kernel/1512.2/01681.html

Thanks,
Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: Question on rhashtable in worst-case scenario.

2016-03-30 Thread Ben Greear

On 03/30/2016 09:38 AM, David Miller wrote:

From: Johannes Berg 
Date: Wed, 30 Mar 2016 11:14:12 +0200

On Tue, 2016-03-29 at 09:16 -0700, Ben Greear wrote:

Looks like rhashtable has too much policy in it to properly deal with
cases where there are too many hash collisions, so I am going to work
on reverting it's use in mac80211.

I'm not really all that happy with that approach - can't we fix the
rhashtable? It's a pretty rare corner case that many keys really are
identical and no kind of hash algorithm, but it seems much better to
still deal with it than to remove the rhashtable usage and go back to
hand-rolling something.

Yeah reverting seems like a really idiotic way to deal with the issue.

If someone can fix rhashtable, then great.
I read some earlier comments [1] back when someone else reported
similar problems, and the comments seemed to indicate that rhashtable was
broken in this manner on purpose to protect against hashing attacks.

If you are baking in this type of policy to what should be a basic
data-type, then it is not useful for how it is being used in
the mac80211 stack.

[1]  http://lkml.iu.edu/hypermail/linux/kernel/1512.2/01681.html

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: Question on rhashtable in worst-case scenario.

2016-03-29 Thread Ben Greear


Looks like rhashtable has too much policy in it to properly deal with
cases where there are too many hash collisions, so I am going to work on
reverting it's use in mac80211.

Thanks,
Ben

On 03/28/2016 01:29 PM, Ben Greear wrote:

Hello!

I have a use case for mac80211 where I create multiple stations to
the same remote peer MAC address.

I'm seeing cases where the rhashtable logic is returning -16 (EBUSY)
on insert (see sta_info_hash_add).
This is with the 4.4.6+ (plus local patches) kernel, and it has the patch 
mentioned
here:

https://lkml.org/lkml/2015/12/3/307

If I understand the code properly, my use case is going to be worst-case 
scenario,
where all of my items in the hash have the same key (peer mac addr).

I have my own secondary hash to handle most of my hot-path lookups, but I still 
need
the main hash to at least function in a linear-search manner.

Any idea what I can do to get rid of the EBUSY return code problem, or how
to debug it further?

Thanks,
Ben




--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Re: Question on rhashtable in worst-case scenario.

2016-03-29 Thread Ben Greear


Looks like rhashtable has too much policy in it to properly deal with
cases where there are too many hash collisions, so I am going to work on
reverting it's use in mac80211.

Thanks,
Ben

On 03/28/2016 01:29 PM, Ben Greear wrote:

Hello!

I have a use case for mac80211 where I create multiple stations to
the same remote peer MAC address.

I'm seeing cases where the rhashtable logic is returning -16 (EBUSY)
on insert (see sta_info_hash_add).
This is with the 4.4.6+ (plus local patches) kernel, and it has the patch 
mentioned
here:

https://lkml.org/lkml/2015/12/3/307

If I understand the code properly, my use case is going to be worst-case 
scenario,
where all of my items in the hash have the same key (peer mac addr).

I have my own secondary hash to handle most of my hot-path lookups, but I still 
need
the main hash to at least function in a linear-search manner.

Any idea what I can do to get rid of the EBUSY return code problem, or how
to debug it further?

Thanks,
Ben




--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Question on rhashtable in worst-case scenario.

2016-03-28 Thread Ben Greear


Hello!

I have a use case for mac80211 where I create multiple stations to
the same remote peer MAC address.

I'm seeing cases where the rhashtable logic is returning -16 (EBUSY)
on insert (see sta_info_hash_add).
This is with the 4.4.6+ (plus local patches) kernel, and it has the patch 
mentioned
here:

https://lkml.org/lkml/2015/12/3/307

If I understand the code properly, my use case is going to be worst-case 
scenario,
where all of my items in the hash have the same key (peer mac addr).

I have my own secondary hash to handle most of my hot-path lookups, but I still 
need
the main hash to at least function in a linear-search manner.

Any idea what I can do to get rid of the EBUSY return code problem, or how
to debug it further?

Thanks,
Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

Question on rhashtable in worst-case scenario.

2016-03-28 Thread Ben Greear


Hello!

I have a use case for mac80211 where I create multiple stations to
the same remote peer MAC address.

I'm seeing cases where the rhashtable logic is returning -16 (EBUSY)
on insert (see sta_info_hash_add).
This is with the 4.4.6+ (plus local patches) kernel, and it has the patch 
mentioned
here:

https://lkml.org/lkml/2015/12/3/307

If I understand the code properly, my use case is going to be worst-case 
scenario,
where all of my items in the hash have the same key (peer mac addr).

I have my own secondary hash to handle most of my hot-path lookups, but I still 
need
the main hash to at least function in a linear-search manner.

Any idea what I can do to get rid of the EBUSY return code problem, or how
to debug it further?

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

Re: 3.19.0-rc4+ fails to compile, missing 'usr/missing-syscalls'

2015-01-12 Thread Ben Greear

On 01/12/2015 12:55 PM, Guenter Roeck wrote:
> On Mon, Jan 12, 2015 at 12:08:22PM -0800, Ben Greear wrote:
>>
>> Any idea what is wrong?
>>
>> -rc3 compiled ok, then I rebased just now, and get this:
>>
> My auto-builders are all happy, with no build or qemu failures.
> Did you try "make mrproper" prior to rebuilding ?

yes, and blew away .ccache too.

But, I saw something weird like git suddenly got confused a bit earlier,
so I will re-clone and try again in case the 'reset --hard HEAD' didn't
clean up as much as I had hoped.

Thanks,
Ben

> 
> Guenter
> 


-- 
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

3.19.0-rc4+ fails to compile, missing 'usr/missing-syscalls'

2015-01-12 Thread Ben Greear


Any idea what is wrong?

-rc3 compiled ok, then I rebased just now, and get this:


cd /mnt/sda/home/greearb/git/linux-3.17.dev.y/drivers/net/wireless/ath/ath10k/
cd ~/kernel/2.6/linux-3.19.x64/; make -j8 bzImage modules
  CHK include/config/kernel.release
  GEN ./Makefile
  CHK include/generated/uapi/linux/version.h
  Using /mnt/sda/home/greearb/git/linux-3.19.dev.y as source for kernel
  CHK include/generated/utsrelease.h
  CC  kernel/bounds.s
  GEN include/generated/bounds.h
  CC  arch/x86/kernel/asm-offsets.s
  GEN include/generated/asm-offsets.h
  CALL/mnt/sda/home/greearb/git/linux-3.19.dev.y/scripts/checksyscalls.sh
make[3]: *** No rule to make target `usr/missing-syscalls', needed by 
`__build'.  Stop.
make[3]: *** Waiting for unfinished jobs
  CC  kernel/bounds.s
  CHK include/generated/compile.h
  CC  init/main.o
  CC  init/do_mounts.o
  CC  init/do_mounts_rd.o
/mnt/sda/home/greearb/git/linux-3.19.dev.y/kernel/Makefile:132: *** No X.509 
certificates found ***
  CC  init/do_mounts_initrd.o
  AS  arch/x86/ia32/ia32entry.o
make[2]: *** [usr] Error 2
make[2]: *** Waiting for unfinished jobs
  CC  arch/x86/ia32/sys_ia32.o

Thanks,
Ben

-- 
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

3.19.0-rc4+ fails to compile, missing 'usr/missing-syscalls'

2015-01-12 Thread Ben Greear


Any idea what is wrong?

-rc3 compiled ok, then I rebased just now, and get this:


cd /mnt/sda/home/greearb/git/linux-3.17.dev.y/drivers/net/wireless/ath/ath10k/
cd ~/kernel/2.6/linux-3.19.x64/; make -j8 bzImage modules
  CHK include/config/kernel.release
  GEN ./Makefile
  CHK include/generated/uapi/linux/version.h
  Using /mnt/sda/home/greearb/git/linux-3.19.dev.y as source for kernel
  CHK include/generated/utsrelease.h
  CC  kernel/bounds.s
  GEN include/generated/bounds.h
  CC  arch/x86/kernel/asm-offsets.s
  GEN include/generated/asm-offsets.h
  CALL/mnt/sda/home/greearb/git/linux-3.19.dev.y/scripts/checksyscalls.sh
make[3]: *** No rule to make target `usr/missing-syscalls', needed by 
`__build'.  Stop.
make[3]: *** Waiting for unfinished jobs
  CC  kernel/bounds.s
  CHK include/generated/compile.h
  CC  init/main.o
  CC  init/do_mounts.o
  CC  init/do_mounts_rd.o
/mnt/sda/home/greearb/git/linux-3.19.dev.y/kernel/Makefile:132: *** No X.509 
certificates found ***
  CC  init/do_mounts_initrd.o
  AS  arch/x86/ia32/ia32entry.o
make[2]: *** [usr] Error 2
make[2]: *** Waiting for unfinished jobs
  CC  arch/x86/ia32/sys_ia32.o

Thanks,
Ben

-- 
Ben Greear gree...@candelatech.com
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 3.19.0-rc4+ fails to compile, missing 'usr/missing-syscalls'

2015-01-12 Thread Ben Greear

On 01/12/2015 12:55 PM, Guenter Roeck wrote:
 On Mon, Jan 12, 2015 at 12:08:22PM -0800, Ben Greear wrote:

 Any idea what is wrong?

 -rc3 compiled ok, then I rebased just now, and get this:

 My auto-builders are all happy, with no build or qemu failures.
 Did you try make mrproper prior to rebuilding ?

yes, and blew away .ccache too.

But, I saw something weird like git suddenly got confused a bit earlier,
so I will re-clone and try again in case the 'reset --hard HEAD' didn't
clean up as much as I had hoped.

Thanks,
Ben

 
 Guenter
 


-- 
Ben Greear gree...@candelatech.com
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

kernel splat in clean 3.16, related to drm/i915

2014-08-13 Thread Ben Greear

I thought I'd do some testing on a stock kernel for a change...saw this during
bootup one one of my systems:


[4.854068] [ cut here ]
[4.854154] WARNING: CPU: 0 PID: 93 at 
/home/greearb/git/linux-2.6.clean/drivers/gpu/drm/i915/intel_panel.c:758 
i9xx_enable_backlight+0xd4/0x100 [i915]()
[4.854156] backlight already enabled
[4.854166] Modules linked in: ata_generic pata_acpi i915(+) i2c_algo_bit 
video drm_kms_helper drm
[4.854172] CPU: 0 PID: 93 Comm: systemd-udevd Not tainted 3.16.0 #1
[4.854175] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To 
be filled by O.E.M., BIOS 080015  05/31/2010
[4.854183]    f759b730 c0a228f4 f759b770 f759b760 c045d24e 
f8616932
[4.854190]  f759b78c 005d f8610188 02f6 f85e8124 f85e8124 f77e 
f77d8400
[4.854197]  f75db400 f759b778 c045d2a3 0009 f759b770 f8616932 f759b78c 
f759b798
[4.854199] Call Trace:
[4.854211]  [] dump_stack+0x48/0x69
[4.854218]  [] warn_slowpath_common+0x7e/0xa0
[4.854288]  [] ? i9xx_enable_backlight+0xd4/0x100 [i915]
[4.854356]  [] ? i9xx_enable_backlight+0xd4/0x100 [i915]
[4.854361]  [] warn_slowpath_fmt+0x33/0x40
[4.854430]  [] i9xx_enable_backlight+0xd4/0x100 [i915]
[4.854500]  [] intel_panel_enable_backlight+0x77/0xc0 [i915]
[4.854568]  [] intel_enable_lvds+0x13f/0x150 [i915]
[4.854632]  [] i9xx_crtc_enable+0x36d/0x400 [i915]
[4.854696]  [] __intel_set_mode+0x6cd/0x1400 [i915]
[4.854761]  [] ? intel_modeset_check_state+0x20b/0x740 [i915]
[4.854825]  [] intel_set_mode+0x23/0x40 [i915]
[4.854889]  [] intel_crtc_set_config+0x7c4/0xba0 [i915]
[4.854916]  [] drm_mode_set_config_internal+0x4e/0xb0 [drm]
[4.854979]  [] ? intel_user_framebuffer_destroy+0xb0/0xb0 [i915]
[4.854989]  [] restore_fbdev_mode+0xaa/0xe0 [drm_kms_helper]
[4.854998]  [] 
drm_fb_helper_restore_fbdev_mode_unlocked+0x1d/0x30 [drm_kms_helper]
[4.855007]  [] drm_fb_helper_set_par+0x26/0x60 [drm_kms_helper]
[4.855014]  [] fbcon_init+0x46a/0x4b0
[4.855021]  [] visual_init+0x9e/0x100
[4.855026]  [] do_bind_con_driver+0x10a/0x2c0
[4.855032]  [] ? sysfs_create_file_ns+0x2c/0x30
[4.855038]  [] do_take_over_console+0xfd/0x190
[4.855043]  [] do_fbcon_takeover+0x5f/0xc0
[4.855048]  [] fbcon_event_notify+0x5ff/0x740
[4.855049]  [] notifier_call_chain+0x41/0x60
[4.855049]  [] __blocking_notifier_call_chain+0x3b/0x60
[4.855049]  [] blocking_notifier_call_chain+0x1f/0x30
[4.855049]  [] fb_notifier_call_chain+0x16/0x20
[4.855049]  [] register_framebuffer+0x1c7/0x2e0
[4.855049]  [] drm_fb_helper_initial_config+0x2d4/0x1020 
[drm_kms_helper]
[4.855049]  [] ? gen4_write32+0x49/0xe0 [i915]
[4.855049]  [] ? gen4_read32+0x3f/0xd0 [i915]
[4.855049]  [] ? intel_fbdev_init+0x193/0x510 [i915]
[4.855049]  [] intel_fbdev_initial_config+0x1c/0x20 [i915]
[4.855049]  [] i915_driver_load+0xe48/0xed0 [i915]
[4.855049]  [] ? i915_switcheroo_set_state+0x90/0x90 [i915]
[4.855049]  [] ? kobject_uevent_env+0x15a/0x690
[4.855049]  [] ? cleanup_uevent_env+0x10/0x10
[4.855049]  [] ? get_device+0x14/0x30
[4.855049]  [] ? klist_class_dev_get+0x12/0x20
[4.855049]  [] ? klist_node_init+0x2e/0x50
[4.855049]  [] ? klist_add_tail+0x34/0x40
[4.855049]  [] ? device_add+0x1b6/0x5c0
[4.855049]  [] ? kobject_set_name_vargs+0x42/0x60
[4.855049]  [] ? drm_sysfs_device_add+0xba/0x100 [drm]
[4.855049]  [] drm_dev_register+0x8e/0xe0 [drm]
[4.855049]  [] drm_get_pci_dev+0x79/0x1c0 [drm]
[4.855049]  [] i915_pci_probe+0x35/0x60 [i915]
[4.855049]  [] pci_device_probe+0x6f/0xc0
[4.855049]  [] ? sysfs_create_link+0x25/0x40
[4.855049]  [] driver_probe_device+0x93/0x3a0
[4.855049]  [] ? pci_match_device+0xca/0x100
[4.855049]  [] __driver_attach+0x71/0x80
[4.855049]  [] ? __device_attach+0x40/0x40
[4.855049]  [] bus_for_each_dev+0x47/0x80
[4.855049]  [] driver_attach+0x1e/0x20
[4.855049]  [] ? __device_attach+0x40/0x40
[4.855049]  [] bus_add_driver+0x157/0x230
[4.855049]  [] ? 0xf824afff
[4.855049]  [] ? 0xf824afff
[4.855049]  [] driver_register+0x59/0xe0
[4.855049]  [] ? soft_offline_page+0x3bf/0x540
[4.855049]  [] __pci_register_driver+0x32/0x40
[4.855049]  [] drm_pci_init+0xe5/0x110 [drm]
[4.855049]  [] ? 0xf824afff
[4.855049]  [] i915_init+0x88/0x8a [i915]
[4.855049]  [] ? 0xf824afff
[4.855049]  [] do_one_initcall+0xaa/0x1e0
[4.855049]  [] ? 0xf824afff
[4.855049]  [] ? kfree+0x11f/0x150
[4.855049]  [] ? __vunmap+0x8f/0xe0
[4.855049]  [] ? __vunmap+0x8f/0xe0
[4.855049]  [] ? __vunmap+0x8f/0xe0
[4.855049]  [] load_module+0x1c9f/0x21d0
[4.855049]  [] SyS_finit_module+0x75/0xc0
[4.855049]  [] ? vm_mmap_pgoff+0x7b/0xa0
[4.855049]  [] sysenter_do_call+0x12/0x12
[4.855049] ---[ end trace 9ef1310c3c12d97e ]---


Thanks,
Ben

-- 
Ben

kernel splat in clean 3.16, related to drm/i915

2014-08-13 Thread Ben Greear

 [i915]
[4.855049]  [f824b000] ? 0xf824afff
[4.855049]  [c040211a] do_one_initcall+0xaa/0x1e0
[4.855049]  [f824b000] ? 0xf824afff
[4.855049]  [c056682f] ? kfree+0x11f/0x150
[4.855049]  [c0555c3f] ? __vunmap+0x8f/0xe0
[4.855049]  [c0555c3f] ? __vunmap+0x8f/0xe0
[4.855049]  [c0555c3f] ? __vunmap+0x8f/0xe0
[4.855049]  [c04cf48f] load_module+0x1c9f/0x21d0
[4.855049]  [c04cfb25] SyS_finit_module+0x75/0xc0
[4.855049]  [c053bbfb] ? vm_mmap_pgoff+0x7b/0xa0
[4.855049]  [c0a2964c] sysenter_do_call+0x12/0x12
[4.855049] ---[ end trace 9ef1310c3c12d97e ]---


Thanks,
Ben

-- 
Ben Greear gree...@candelatech.com
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Crashes in 3.14.13+ related to anon_vma_clone.

2014-07-24 Thread Ben Greear

On 07/24/2014 04:08 PM, Ben Greear wrote:
> A few of our systems are repeatedly crashing when upgraded from
> the 3.14.6+ to 3.14.13+ kernels.  Both kernels have a fair bit
> of our out-of-tree patches, so could be our fault.

Ahh, looks like a bad merge of a stable patch with a slightly different
manually applied stable patch, causing double-free
in the wifi stack...

Sorry for the noise.

Thanks,
Ben

> 
> But, in case this is a known problem...has anyone seen crashes like
> this below?
> 
> Test case is using some ath9k APs and stations, but not certain
> that has much to do with the problem.
> 
> BUG: unable to handle kernel paging request at 003f9840b000
> IP: [] anon_vma_clone+0x88/0xf5
> PGD be580067 PUD bab85067 PMD babd5067 PTE 8000bebbd065
> Oops: 0003 [#1] PREEMPT SMP
> Modules linked in: 8021q garp stp mrp llc fuse macvlan wanlink(O) pktgen 
> ip6table_filter ip6_tables ebtable_nat ebtables coretemp hwmoo
> CPU: 1 PID: 1777 Comm: btserver Tainted: G C O 3.14.13+ #19
> Hardware name: To be filled by O.E.M. To be filled by O.E.M./To be filled by 
> O.E.M., BIOS 4.6.3 03/06/2012
> task: 88020dcfb100 ti: 8800baa3e000 task.ti: 8800baa3e000
> RIP: 0010:[]  [] anon_vma_clone+0x88/0xf5
> RSP: 0018:8800baa3fda0  EFLAGS: 00010246
> RAX: 8800ba8b8380 RBX: 003f9840b000 RCX: 0142b101
> RDX:  RSI: 8800ba8b8380 RDI: 8800ba8b8388
> RBP: 8800baa3fdd0 R08: 00015d60 R09: 003f9840b000
> R10: 000b R11: 00016338 R12: 
> R13: 8800bc609450 R14: 8802212db980 R15: 8800ba8b8380
> FS:  7f0b70511740() GS:88022bc8() knlGS:
> CS:  0010 DS:  ES:  CR0: 8005003b
> CR2: 003f9840b000 CR3: bab8f000 CR4: 07e0
> Stack:
>  880221ba92a0 88020dcfc980 8800bc609450 880221ba9228
>  8800bc609450 880221ba9228 8800baa3fe08 81169b7b
>  88020dcfc980 880211db5f80 880221ba9228 8800bc609450
> Call Trace:
>  [] anon_vma_fork+0x26/0xec
>  [] copy_process.part.28+0xf98/0x16ca
>  [] do_fork+0xb4/0x211
>  [] ? __set_task_blocked+0x5e/0x64
>  [] ? _raw_spin_unlock_irq+0xc/0x1f
>  [] SyS_clone+0x11/0x13
>  [] stub_clone+0x69/0x90
>  [] ? system_call_fastpath+0x1a/0x1f
> Code: 48 8b 3d 64 f1 b0 00 be d0 00 00 00 e8 9b 5e 01 00 48 85 c0 48 89 c3 74 
> 62 45 31 e4 4d 8b 7e 08 4c 89 e7 49 8b 37 e8 fa ec ff ff
> RIP  [] anon_vma_clone+0x88/0xf5
>  RSP 
> CR2: 003f9840b000
> 


-- 
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Crashes in 3.14.13+ related to anon_vma_clone.

2014-07-24 Thread Ben Greear

A few of our systems are repeatedly crashing when upgraded from
the 3.14.6+ to 3.14.13+ kernels.  Both kernels have a fair bit
of our out-of-tree patches, so could be our fault.

But, in case this is a known problem...has anyone seen crashes like
this below?

Test case is using some ath9k APs and stations, but not certain
that has much to do with the problem.

BUG: unable to handle kernel paging request at 003f9840b000
IP: [] anon_vma_clone+0x88/0xf5
PGD be580067 PUD bab85067 PMD babd5067 PTE 8000bebbd065
Oops: 0003 [#1] PREEMPT SMP
Modules linked in: 8021q garp stp mrp llc fuse macvlan wanlink(O) pktgen 
ip6table_filter ip6_tables ebtable_nat ebtables coretemp hwmoo
CPU: 1 PID: 1777 Comm: btserver Tainted: G C O 3.14.13+ #19
Hardware name: To be filled by O.E.M. To be filled by O.E.M./To be filled by 
O.E.M., BIOS 4.6.3 03/06/2012
task: 88020dcfb100 ti: 8800baa3e000 task.ti: 8800baa3e000
RIP: 0010:[]  [] anon_vma_clone+0x88/0xf5
RSP: 0018:8800baa3fda0  EFLAGS: 00010246
RAX: 8800ba8b8380 RBX: 003f9840b000 RCX: 0142b101
RDX:  RSI: 8800ba8b8380 RDI: 8800ba8b8388
RBP: 8800baa3fdd0 R08: 00015d60 R09: 003f9840b000
R10: 000b R11: 00016338 R12: 
R13: 8800bc609450 R14: 8802212db980 R15: 8800ba8b8380
FS:  7f0b70511740() GS:88022bc8() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 003f9840b000 CR3: bab8f000 CR4: 07e0
Stack:
 880221ba92a0 88020dcfc980 8800bc609450 880221ba9228
 8800bc609450 880221ba9228 8800baa3fe08 81169b7b
 88020dcfc980 880211db5f80 880221ba9228 8800bc609450
Call Trace:
 [] anon_vma_fork+0x26/0xec
 [] copy_process.part.28+0xf98/0x16ca
 [] do_fork+0xb4/0x211
 [] ? __set_task_blocked+0x5e/0x64
 [] ? _raw_spin_unlock_irq+0xc/0x1f
 [] SyS_clone+0x11/0x13
 [] stub_clone+0x69/0x90
 [] ? system_call_fastpath+0x1a/0x1f
Code: 48 8b 3d 64 f1 b0 00 be d0 00 00 00 e8 9b 5e 01 00 48 85 c0 48 89 c3 74 
62 45 31 e4 4d 8b 7e 08 4c 89 e7 49 8b 37 e8 fa ec ff ff
RIP  [] anon_vma_clone+0x88/0xf5
 RSP 
CR2: 003f9840b000

-- 
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Crashes in 3.14.13+ related to anon_vma_clone.

2014-07-24 Thread Ben Greear

A few of our systems are repeatedly crashing when upgraded from
the 3.14.6+ to 3.14.13+ kernels.  Both kernels have a fair bit
of our out-of-tree patches, so could be our fault.

But, in case this is a known problem...has anyone seen crashes like
this below?

Test case is using some ath9k APs and stations, but not certain
that has much to do with the problem.

BUG: unable to handle kernel paging request at 003f9840b000
IP: [81169ae8] anon_vma_clone+0x88/0xf5
PGD be580067 PUD bab85067 PMD babd5067 PTE 8000bebbd065
Oops: 0003 [#1] PREEMPT SMP
Modules linked in: 8021q garp stp mrp llc fuse macvlan wanlink(O) pktgen 
ip6table_filter ip6_tables ebtable_nat ebtables coretemp hwmoo
CPU: 1 PID: 1777 Comm: btserver Tainted: G C O 3.14.13+ #19
Hardware name: To be filled by O.E.M. To be filled by O.E.M./To be filled by 
O.E.M., BIOS 4.6.3 03/06/2012
task: 88020dcfb100 ti: 8800baa3e000 task.ti: 8800baa3e000
RIP: 0010:[81169ae8]  [81169ae8] anon_vma_clone+0x88/0xf5
RSP: 0018:8800baa3fda0  EFLAGS: 00010246
RAX: 8800ba8b8380 RBX: 003f9840b000 RCX: 0142b101
RDX:  RSI: 8800ba8b8380 RDI: 8800ba8b8388
RBP: 8800baa3fdd0 R08: 00015d60 R09: 003f9840b000
R10: 000b R11: 00016338 R12: 
R13: 8800bc609450 R14: 8802212db980 R15: 8800ba8b8380
FS:  7f0b70511740() GS:88022bc8() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 003f9840b000 CR3: bab8f000 CR4: 07e0
Stack:
 880221ba92a0 88020dcfc980 8800bc609450 880221ba9228
 8800bc609450 880221ba9228 8800baa3fe08 81169b7b
 88020dcfc980 880211db5f80 880221ba9228 8800bc609450
Call Trace:
 [81169b7b] anon_vma_fork+0x26/0xec
 [810c0070] copy_process.part.28+0xf98/0x16ca
 [810c091d] do_fork+0xb4/0x211
 [810cbe9b] ? __set_task_blocked+0x5e/0x64
 [815a1b9f] ? _raw_spin_unlock_irq+0xc/0x1f
 [810c0ae0] SyS_clone+0x11/0x13
 [815a6a59] stub_clone+0x69/0x90
 [815a66fd] ? system_call_fastpath+0x1a/0x1f
Code: 48 8b 3d 64 f1 b0 00 be d0 00 00 00 e8 9b 5e 01 00 48 85 c0 48 89 c3 74 
62 45 31 e4 4d 8b 7e 08 4c 89 e7 49 8b 37 e8 fa ec ff ff
RIP  [81169ae8] anon_vma_clone+0x88/0xf5
 RSP 8800baa3fda0
CR2: 003f9840b000

-- 
Ben Greear gree...@candelatech.com
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Crashes in 3.14.13+ related to anon_vma_clone.

2014-07-24 Thread Ben Greear

On 07/24/2014 04:08 PM, Ben Greear wrote:
 A few of our systems are repeatedly crashing when upgraded from
 the 3.14.6+ to 3.14.13+ kernels.  Both kernels have a fair bit
 of our out-of-tree patches, so could be our fault.

Ahh, looks like a bad merge of a stable patch with a slightly different
manually applied stable patch, causing double-free
in the wifi stack...

Sorry for the noise.

Thanks,
Ben

 
 But, in case this is a known problem...has anyone seen crashes like
 this below?
 
 Test case is using some ath9k APs and stations, but not certain
 that has much to do with the problem.
 
 BUG: unable to handle kernel paging request at 003f9840b000
 IP: [81169ae8] anon_vma_clone+0x88/0xf5
 PGD be580067 PUD bab85067 PMD babd5067 PTE 8000bebbd065
 Oops: 0003 [#1] PREEMPT SMP
 Modules linked in: 8021q garp stp mrp llc fuse macvlan wanlink(O) pktgen 
 ip6table_filter ip6_tables ebtable_nat ebtables coretemp hwmoo
 CPU: 1 PID: 1777 Comm: btserver Tainted: G C O 3.14.13+ #19
 Hardware name: To be filled by O.E.M. To be filled by O.E.M./To be filled by 
 O.E.M., BIOS 4.6.3 03/06/2012
 task: 88020dcfb100 ti: 8800baa3e000 task.ti: 8800baa3e000
 RIP: 0010:[81169ae8]  [81169ae8] anon_vma_clone+0x88/0xf5
 RSP: 0018:8800baa3fda0  EFLAGS: 00010246
 RAX: 8800ba8b8380 RBX: 003f9840b000 RCX: 0142b101
 RDX:  RSI: 8800ba8b8380 RDI: 8800ba8b8388
 RBP: 8800baa3fdd0 R08: 00015d60 R09: 003f9840b000
 R10: 000b R11: 00016338 R12: 
 R13: 8800bc609450 R14: 8802212db980 R15: 8800ba8b8380
 FS:  7f0b70511740() GS:88022bc8() knlGS:
 CS:  0010 DS:  ES:  CR0: 8005003b
 CR2: 003f9840b000 CR3: bab8f000 CR4: 07e0
 Stack:
  880221ba92a0 88020dcfc980 8800bc609450 880221ba9228
  8800bc609450 880221ba9228 8800baa3fe08 81169b7b
  88020dcfc980 880211db5f80 880221ba9228 8800bc609450
 Call Trace:
  [81169b7b] anon_vma_fork+0x26/0xec
  [810c0070] copy_process.part.28+0xf98/0x16ca
  [810c091d] do_fork+0xb4/0x211
  [810cbe9b] ? __set_task_blocked+0x5e/0x64
  [815a1b9f] ? _raw_spin_unlock_irq+0xc/0x1f
  [810c0ae0] SyS_clone+0x11/0x13
  [815a6a59] stub_clone+0x69/0x90
  [815a66fd] ? system_call_fastpath+0x1a/0x1f
 Code: 48 8b 3d 64 f1 b0 00 be d0 00 00 00 e8 9b 5e 01 00 48 85 c0 48 89 c3 74 
 62 45 31 e4 4d 8b 7e 08 4c 89 e7 49 8b 37 e8 fa ec ff ff
 RIP  [81169ae8] anon_vma_clone+0x88/0xf5
  RSP 8800baa3fda0
 CR2: 003f9840b000
 


-- 
Ben Greear gree...@candelatech.com
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] ath10k: fix rssi reporting.

2014-05-15 Thread Ben Greear

Sorry, ...this and previous patch should not have gone to LKML.

Will send it over to ath10k list where it was supposed to go.

Thanks,
Ben

On 05/15/2014 11:31 AM, gree...@candelatech.com wrote:
> From: Ben Greear 
> 
> When the driver cannot provide proper rssi, mark
> status with RX_FLAG_NO_SIGNAL_VAL so that stack
> properly ignores it.
> 
> Signed-off-by: Ben Greear 
> ---
> 
> Patch is against my tree, hopefully will apply OK against upstream
> but have not tested that yet.
> 
>  drivers/net/wireless/ath/ath10k/htt_rx.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c 
> b/drivers/net/wireless/ath/ath10k/htt_rx.c
> index 0c83ffb..d8ec8dd 100644
> --- a/drivers/net/wireless/ath/ath10k/htt_rx.c
> +++ b/drivers/net/wireless/ath/ath10k/htt_rx.c
> @@ -1216,11 +1216,12 @@ static void ath10k_htt_rx_handler(struct ath10k_htt 
> *htt,
>   mpdu_ranges = htt_rx_ind_get_mpdu_ranges(rx);
>  
>   /* Fill this once, while this is per-ppdu */
> - if (rx->ppdu.info0 & HTT_RX_INDICATION_INFO0_START_VALID) {
> - memset(rx_status, 0, sizeof(*rx_status));
> + memset(rx_status, 0, sizeof(*rx_status));
> + if (rx->ppdu.info0 & HTT_RX_INDICATION_INFO0_START_VALID)
>   rx_status->signal  = ATH10K_DEFAULT_NOISE_FLOOR +
>rx->ppdu.combined_rssi;
> - }
> + else
> + rx_status->flag |= RX_FLAG_NO_SIGNAL_VAL;
>  
>   if (rx->ppdu.info0 & HTT_RX_INDICATION_INFO0_END_VALID) {
>   /* TSF available only in 32-bit */
> 


-- 
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] ath10k: fix rssi reporting.

2014-05-15 Thread Ben Greear

Sorry, ...this and previous patch should not have gone to LKML.

Will send it over to ath10k list where it was supposed to go.

Thanks,
Ben

On 05/15/2014 11:31 AM, gree...@candelatech.com wrote:
 From: Ben Greear gree...@candelatech.com
 
 When the driver cannot provide proper rssi, mark
 status with RX_FLAG_NO_SIGNAL_VAL so that stack
 properly ignores it.
 
 Signed-off-by: Ben Greear gree...@candelatech.com
 ---
 
 Patch is against my tree, hopefully will apply OK against upstream
 but have not tested that yet.
 
  drivers/net/wireless/ath/ath10k/htt_rx.c | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)
 
 diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c 
 b/drivers/net/wireless/ath/ath10k/htt_rx.c
 index 0c83ffb..d8ec8dd 100644
 --- a/drivers/net/wireless/ath/ath10k/htt_rx.c
 +++ b/drivers/net/wireless/ath/ath10k/htt_rx.c
 @@ -1216,11 +1216,12 @@ static void ath10k_htt_rx_handler(struct ath10k_htt 
 *htt,
   mpdu_ranges = htt_rx_ind_get_mpdu_ranges(rx);
  
   /* Fill this once, while this is per-ppdu */
 - if (rx-ppdu.info0  HTT_RX_INDICATION_INFO0_START_VALID) {
 - memset(rx_status, 0, sizeof(*rx_status));
 + memset(rx_status, 0, sizeof(*rx_status));
 + if (rx-ppdu.info0  HTT_RX_INDICATION_INFO0_START_VALID)
   rx_status-signal  = ATH10K_DEFAULT_NOISE_FLOOR +
rx-ppdu.combined_rssi;
 - }
 + else
 + rx_status-flag |= RX_FLAG_NO_SIGNAL_VAL;
  
   if (rx-ppdu.info0  HTT_RX_INDICATION_INFO0_END_VALID) {
   /* TSF available only in 32-bit */
 


-- 
Ben Greear gree...@candelatech.com
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] netdev: pktgen xmit packet through vlan interface

2014-05-05 Thread Ben Greear

On 05/05/2014 08:12 AM, Jesper Dangaard Brouer wrote:
> On Fri, 02 May 2014 07:00:00 -0700
> John Fastabend  wrote:
> 
>> On 5/2/2014 6:19 AM, Jesper Dangaard Brouer wrote:
>>> On Fri,  2 May 2014 15:18:12 +0800
>>> Zhouyi Zhou  wrote:
>>>
>>>> As http://www.spinics.net/lists/netdev/msg165015.html
>>>> pktgen generates shared packet through vlan interface will cause
>>>> oops because of duplicate entering tc queue.
>>>>
>>>> Try to solve this problem by means of packet clone instead of sharing.
>>>
>>> I really don't like adding this stuff to the fast path of pktgen.
>>>
>>> Why would you use pktgen on a VLAN?
>>
>> Its a good way to test qdiscs. When you run pktgen over the VLAN
>> you exercise the lower devices qdisc.
> 
> I do (personally) need a faster way/tool to exercise the qdisc path.
> I'm currently using trafgen, but it is not fast enough for my 10G
> testing.
> 
> Perhaps we could add a pktgen option, that explicitly enable
> transmitting on qdisc path.  And when adding a VLAN device, auto enable
> that mode?

You could just force pktgen to not support multi-skb on vlan interfaces?

I thought we went through this a year or two ago and came up with
something like a 'pktgen-challenged' network interface flag?

Thanks,
Ben


-- 
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] netdev: pktgen xmit packet through vlan interface

2014-05-05 Thread Ben Greear

On 05/05/2014 08:12 AM, Jesper Dangaard Brouer wrote:
 On Fri, 02 May 2014 07:00:00 -0700
 John Fastabend john.r.fastab...@intel.com wrote:
 
 On 5/2/2014 6:19 AM, Jesper Dangaard Brouer wrote:
 On Fri,  2 May 2014 15:18:12 +0800
 Zhouyi Zhou zhouzho...@gmail.com wrote:

 As http://www.spinics.net/lists/netdev/msg165015.html
 pktgen generates shared packet through vlan interface will cause
 oops because of duplicate entering tc queue.

 Try to solve this problem by means of packet clone instead of sharing.

 I really don't like adding this stuff to the fast path of pktgen.

 Why would you use pktgen on a VLAN?

 Its a good way to test qdiscs. When you run pktgen over the VLAN
 you exercise the lower devices qdisc.
 
 I do (personally) need a faster way/tool to exercise the qdisc path.
 I'm currently using trafgen, but it is not fast enough for my 10G
 testing.
 
 Perhaps we could add a pktgen option, that explicitly enable
 transmitting on qdisc path.  And when adding a VLAN device, auto enable
 that mode?

You could just force pktgen to not support multi-skb on vlan interfaces?

I thought we went through this a year or two ago and came up with
something like a 'pktgen-challenged' network interface flag?

Thanks,
Ben


-- 
Ben Greear gree...@candelatech.com
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Build failure related to 'spinlokk_types.h'

2014-02-26 Thread Ben Greear

On 02/26/2014 11:50 AM, Ben Greear wrote:
> This is from the ath10k tree, which was recently rebased on top of
> 3.14.0-rc4.
> 
> I'm getting the error below, but I cannot find any reference to 'spinlokk'
> in the source tree.  The build tree has two mentions, but these are 
> auto-generated
> from what I can tell:
> 
> [greearb@ben-dt2 linux-ath.x64]$ grep -r spinlokk .
> ./arch/x86/kernel/cpu/.common.o.tmp:  
> /mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h \
> ./arch/x86/kernel/cpu/.common.o.d: 
> /mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h \
> [greearb@ben-dt2 linux-ath.x64]$

Looks like it was a .ccache issue...I removed it and re-built and it was
fine...

Thanks,
Ben


-- 
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Build failure related to 'spinlokk_types.h'

2014-02-26 Thread Ben Greear

This is from the ath10k tree, which was recently rebased on top of
3.14.0-rc4.

I'm getting the error below, but I cannot find any reference to 'spinlokk'
in the source tree.  The build tree has two mentions, but these are 
auto-generated
from what I can tell:

[greearb@ben-dt2 linux-ath.x64]$ grep -r spinlokk .
./arch/x86/kernel/cpu/.common.o.tmp:  
/mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h \
./arch/x86/kernel/cpu/.common.o.d: 
/mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h \
[greearb@ben-dt2 linux-ath.x64]$

I tried make distclean, but that did not help.


Any idea where this might be coming from?

[greearb@ben-dt2 linux-ath.x64]$ make  bzImage modules
make[3]: Nothing to be done for `all'.
make[3]: Nothing to be done for `relocs'.
  CHK include/config/kernel.release
  Using /mnt/sda/home/greearb/git/linux.ath as source for kernel
  GEN /home/greearb/kernel/2.6/linux-ath.x64/Makefile
  CHK include/generated/uapi/linux/version.h
  CHK include/generated/utsrelease.h
  CALL/mnt/sda/home/greearb/git/linux.ath/scripts/checksyscalls.sh
  CHK include/generated/compile.h
  CC  arch/x86/kernel/cpu/common.o
fixdep: error opening config file: 
/mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h: No such 
file or directory
make[5]: *** [arch/x86/kernel/cpu/common.o] Error 2
make[4]: *** [arch/x86/kernel/cpu] Error 2
make[3]: *** [arch/x86/kernel] Error 2
make[2]: *** [arch/x86] Error 2
make[1]: *** [sub-make] Error 2
make: *** [all] Error 2


Thanks,
Ben

-- 
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Build failure related to 'spinlokk_types.h'

2014-02-26 Thread Ben Greear

This is from the ath10k tree, which was recently rebased on top of
3.14.0-rc4.

I'm getting the error below, but I cannot find any reference to 'spinlokk'
in the source tree.  The build tree has two mentions, but these are 
auto-generated
from what I can tell:

[greearb@ben-dt2 linux-ath.x64]$ grep -r spinlokk .
./arch/x86/kernel/cpu/.common.o.tmp:  
/mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h \
./arch/x86/kernel/cpu/.common.o.d: 
/mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h \
[greearb@ben-dt2 linux-ath.x64]$

I tried make distclean, but that did not help.


Any idea where this might be coming from?

[greearb@ben-dt2 linux-ath.x64]$ make  bzImage modules
make[3]: Nothing to be done for `all'.
make[3]: Nothing to be done for `relocs'.
  CHK include/config/kernel.release
  Using /mnt/sda/home/greearb/git/linux.ath as source for kernel
  GEN /home/greearb/kernel/2.6/linux-ath.x64/Makefile
  CHK include/generated/uapi/linux/version.h
  CHK include/generated/utsrelease.h
  CALL/mnt/sda/home/greearb/git/linux.ath/scripts/checksyscalls.sh
  CHK include/generated/compile.h
  CC  arch/x86/kernel/cpu/common.o
fixdep: error opening config file: 
/mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h: No such 
file or directory
make[5]: *** [arch/x86/kernel/cpu/common.o] Error 2
make[4]: *** [arch/x86/kernel/cpu] Error 2
make[3]: *** [arch/x86/kernel] Error 2
make[2]: *** [arch/x86] Error 2
make[1]: *** [sub-make] Error 2
make: *** [all] Error 2


Thanks,
Ben

-- 
Ben Greear gree...@candelatech.com
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Build failure related to 'spinlokk_types.h'

2014-02-26 Thread Ben Greear

On 02/26/2014 11:50 AM, Ben Greear wrote:
 This is from the ath10k tree, which was recently rebased on top of
 3.14.0-rc4.
 
 I'm getting the error below, but I cannot find any reference to 'spinlokk'
 in the source tree.  The build tree has two mentions, but these are 
 auto-generated
 from what I can tell:
 
 [greearb@ben-dt2 linux-ath.x64]$ grep -r spinlokk .
 ./arch/x86/kernel/cpu/.common.o.tmp:  
 /mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h \
 ./arch/x86/kernel/cpu/.common.o.d: 
 /mnt/sda/home/greearb/git/linux.ath/include/linux/spinlokk_types.h \
 [greearb@ben-dt2 linux-ath.x64]$

Looks like it was a .ccache issue...I removed it and re-built and it was
fine...

Thanks,
Ben


-- 
Ben Greear gree...@candelatech.com
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 4/11] use ether_addr_equal_64bits

2013-12-31 Thread Ben Greear


On 12/31/2013 08:09 AM, Julia Lawall wrote:



On Tue, 31 Dec 2013, Ben Greear wrote:


On 12/30/2013 10:32 PM, Julia Lawall wrote:

I'm just thinking of a programmer, e.g. changing a struct like this:

   struct foo {
 u8 addr[ETH_ALEN];
-  u16 dummy;
   };


I don't know of a way to catch that.
Anyone else?


Well, one could have a semantic patch that checks for that.  But the
problem is that it is very slow, and it only covers the cases that I can
transform automatically, which currently means no pointers, only explicit
arrays.

On the other hand, I am finding the structure definition, so I can easily
update the structure definition with an appropriate comment.

struct foo {
  u8 addr[ETH_ALEN]; /* must be followed by two bytes in the structure */
  u16 dummy;
};

Unfortunately it is kind of verbose.  Could there be an attribute?  That
could even easily be checked.


Can you not just add a build-time macro to check that sizeof(foo) >= 8
for each of these struct foos?  Or, is it required that the dummy field
be there and be not used by anything else?


It doesn't matter what the field is used for.  The problem is that is it
necessary to ensure a property of the position of addr within the
structure.  It has to have at least 16 bytes after it.


You mean 16 bits?



But maybe something with sizeof(foo) and offset_of would do?

Could the macro be put near the declaration of the structure somehow?


I think that would work, but do not know all of the details of such
macros, so it's possible there is some catch.

If nothing else, then some run-time code that calculates the offset off
and asserts if it is broken in module initialization or similar might
be good enough.

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

1 2 3 4 5 >

1 - 100 of 478 matches

Mail list logo