Dear Aleksandr,
Thank you for your reply. Just a note at the beginning, that your mailer
seems to wrap lines without preserving the right citation level. It’d be
great if you used a mailer following standards. (I know, it’s hard in a
corporate environment, but other Intel developers seem to have found
solutions for this.)
Am 15.10.24 um 11:45 schrieb Loktionov, Aleksandr:
-----Original Message-----
From: Paul Menzel <[email protected]>
Sent: Tuesday, October 15, 2024 10:24 AM
Thank you for your patch. For the summary I’d make it more about the
action of the patch like “Add intermediate filter to …”.
Sorry, I don't get your point. This patch fixes bug that can stop
vfs to receive any traffic making them useless. The first and most
visible effect of the bug is a lot of "Error I40E_AQ_RC_ENOSPC
adding RX filters on VF XX,..." errors from F/W complaining to add
MAC/VLAN filter. So I've mentioned it in the title to be easy found.
I don't add any filter in the driver, we have to add one more
intermediate state of the filter to avoid the race condition.
In my opinion, having the log in the body is good enough for search
engines and the summary should be optimized for `git log --oneline`
consumption. I am sorry about the confusion with my example. Maybe:
Add intermediate sync state to fix race condition
Am 15.10.24 um 09:04 schrieb Aleksandr Loktionov:
Fix a race condition in the i40e driver that leads to MAC/VLAN filters
becoming corrupted and leaking. Address the issue that occurs under
heavy load when multiple threads are concurrently modifying MAC/VLAN
filters by setting mac and port VLAN.
1. Thread T0 allocates a filter in i40e_add_filter() within
i40e_ndo_set_vf_port_vlan().
2. Thread T1 concurrently frees the filter in __i40e_del_filter() within
i40e_ndo_set_vf_mac().
3. Subsequently, i40e_service_task() calls i40e_sync_vsi_filters(), which
refers to the already freed filter memory, causing corruption.
Reproduction steps:
1. Spawn multiple VFs.
2. Apply a concurrent heavy load by running parallel operations to change
MAC addresses on the VFs and change port VLANs on the host.
It’d be great if you shared your commands.
Sorry, I'm pretty sure it's quite impossible to reproduce the issue
with bash scripts /*I've tried really hard*/. Reproduction is
related to user-space/kernel code which might be not open-sourced.
So as I've explained in the commit title the race condition
possibility that was introduced from the very beginning.
Could you please ask to get clearance to publish it. My naive view
wonders, why legal(?) should oppose publication.
3. Observe errors in dmesg:
"Error I40E_AQ_RC_ENOSPC adding RX filters on VF XX,
please set promiscuous on manually for VF XX".
I’d indent it by eight spaces and put it in one line.
Ok, I'll fix it in v2
The fix involves implementing a new intermediate filter state,
I40E_FILTER_NEW_SYNC, for the time when a filter is on a tmp_add_list.
These filters cannot be deleted from the hash list directly but
must be removed using the full process.
Please excuse my ignorance. Where is that done in the diff? For me it
looks like you only replace `I40E_FILTER_NEW` by `I40E_FILTER_NEW_SYNC`
in certain places, but no new condition for this case.
Here are below the code which adds new I40E_FILTER_NEW_SYNC enum.
And additional conditions for this I40E_FILTER_NEW_SYNC state. All
other places in the driver just tract I40E_FILTER_NEW_SYNC as not
just I40E_FILTER_NEW by default.
Thank you. For me it’s not so obvious from the diff, and indeed, it’s
done in `i40e_sync_vsi_filters()`. Thank you again.
Fixes: 278e7d0b9d68 ("i40e: store MAC/VLAN filters in a hash with the MAC Address as
key")
Signed-off-by: Aleksandr Loktionov <[email protected]>
---
drivers/net/ethernet/intel/i40e/i40e.h | 2 ++
drivers/net/ethernet/intel/i40e/i40e_debugfs.c | 2 ++
drivers/net/ethernet/intel/i40e/i40e_main.c | 12 ++++++++++--
3 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e.h
b/drivers/net/ethernet/intel/i40e/i40e.h
index 2089a0e..a1842dd 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -755,6 +755,8 @@ enum i40e_filter_state {
I40E_FILTER_ACTIVE, /* Added to switch by FW */
I40E_FILTER_FAILED, /* Rejected by FW */
I40E_FILTER_REMOVE, /* To be removed */
+ /* RESERVED */
Why the reserved comment? Please elaborate in the commit message.
This is for not breaking compatibility with different driver
versions. Between OOT, net-next and plain old net. Isn't it obvious
from the comment, it's "RESERVERD"?
Apparently not, otherwise I wouldn’t ask. ;-)
Can you provide me example commit message what I should follow?
There are people reading the code not familiar with the ecosystem, that
there is an out of tree driver fore example. So the code or the commit
message should have an explanation why `I40E_FILTER_NEW_SYNC = 6` and
what the reserved is used for.
+ I40E_FILTER_NEW_SYNC = 6, /* New, not sent, in sync task */
Also mention the hash list in the comment?
/* There is no 'removed' state; the filter struct is freed */
};
struct i40e_mac_filter {
diff --git a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
index abf624d..1c439b1 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
@@ -89,6 +89,8 @@ static char *i40e_filter_state_string[] = {
"ACTIVE",
"FAILED",
"REMOVE",
+ "<RESERVED>",
+ "NEW_SYNC",
};
/**
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 25295ae..55fb362 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -1255,6 +1255,7 @@ int i40e_count_filters(struct i40e_vsi *vsi)
hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, hlist) {
if (f->state == I40E_FILTER_NEW ||
+ f->state == I40E_FILTER_NEW_SYNC ||
f->state == I40E_FILTER_ACTIVE)
++cnt;
}
@@ -1441,6 +1442,8 @@ static int i40e_correct_mac_vlan_filters(struct i40e_vsi
*vsi,
new->f = add_head;
new->state = add_head->state;
+ if (add_head->state == I40E_FILTER_NEW)
+ add_head->state = I40E_FILTER_NEW_SYNC;
/* Add the new filter to the tmp list */
hlist_add_head(&new->hlist, tmp_add_list);
@@ -1550,6 +1553,8 @@ static int i40e_correct_vf_mac_vlan_filters(struct
i40e_vsi *vsi,
return -ENOMEM;
new_mac->f = add_head;
new_mac->state = add_head->state;
+ if (add_head->state == I40E_FILTER_NEW)
+ add_head->state = I40E_FILTER_NEW_SYNC;
/* Add the new filter to the tmp list */
hlist_add_head(&new_mac->hlist, tmp_add_list);
@@ -2437,7 +2442,8 @@ static int
i40e_aqc_broadcast_filter(struct i40e_vsi *vsi, const char *vsi_name,
struct i40e_mac_filter *f)
{
- bool enable = f->state == I40E_FILTER_NEW;
+ bool enable = f->state == I40E_FILTER_NEW ||
+ f->state == I40E_FILTER_NEW_SYNC;
struct i40e_hw *hw = &vsi->back->hw;
int aq_ret;
@@ -2611,6 +2617,7 @@ int i40e_sync_vsi_filters(struct i40e_vsi *vsi)
/* Add it to the hash list */
hlist_add_head(&new->hlist, &tmp_add_list);
+ f->state = I40E_FILTER_NEW_SYNC;
}
/* Count the number of active (current and new) VLAN
@@ -2762,7 +2769,8 @@ int i40e_sync_vsi_filters(struct i40e_vsi *vsi)
spin_lock_bh(&vsi->mac_filter_hash_lock);
hlist_for_each_entry_safe(new, h, &tmp_add_list, hlist) {
/* Only update the state if we're still NEW */
- if (new->f->state == I40E_FILTER_NEW)
+ if (new->f->state == I40E_FILTER_NEW ||
+ new->f->state == I40E_FILTER_NEW_SYNC)
new->f->state = new->state;
hlist_del(&new->hlist);
netdev_hw_addr_refcnt(new->f, vsi->netdev, -1);
Thank you again for your work and explanations.
Kind regards,
Paul