On 31/05/2025 11:56, Sven Eckelmann wrote:
On Monday, 19 May 2025 22:46:31 CEST Matthias Schiffer wrote:
  struct batadv_hard_iface *
-batadv_hardif_get_by_netdev(const struct net_device *net_dev)
+batadv_hardif_get_by_netdev(struct net_device *net_dev)
  {
         struct batadv_hard_iface *hard_iface;
+       struct net_device *mesh_iface;
- rcu_read_lock();
-       list_for_each_entry_rcu(hard_iface, &batadv_hardif_list, list) {
-               if (hard_iface->net_dev == net_dev &&
-                   kref_get_unless_zero(&hard_iface->refcount))
-                       goto out;
-       }
+       mesh_iface = netdev_master_upper_dev_get(net_dev);
+       if (!mesh_iface || !batadv_meshif_is_valid(mesh_iface))
+               return NULL;
- hard_iface = NULL;
+       hard_iface = netdev_lower_dev_get_private(mesh_iface, net_dev);
+       if (!kref_get_unless_zero(&hard_iface->refcount))
+               return NULL;
-out:
-       rcu_read_unlock();
         return hard_iface;
  }

This code is now relying on rtnl_lock() (see `ASSERT_RTNL` in
`netdev_master_upper_dev_get` and most likely some comments somwhere about the
lists used by `netdev_lower_dev_get_private`). But `batadv_tt_local_add` is
using this function without holding this lock all the time. For example during
packet processing.

See for example `batadv_tt_local_add` calls in `batadv_interface_tx`. This
will happen when `skb->skb_iif` is not 0 (so it was forwarded).


Please double check this - I have not actually tested it but just went through
the code.


And saying this, the `batadv_hardif_get_by_netdev` call was also used to
retrieve additional information about alll kind of interfaces - even when they
are not used by batman-adv directly. For example for figuring out if it is a
wifi interface(for the TT wifi flag). With you change here, you are basically
breaking this functionality because you now require that the netdev is a lower
interface of batman-adv. Therefore, things like:


                    ┌──────┐
        ┌───────────┼br-lan├──────┐
        │           └──────┘      │
        │                         │
        │                         │
      ┌─▼─┐                    ┌──▼─┐
      │ap0│                    │bat0│
      └───┘                    └──┬─┘
                                  │
                                  │
                               ┌──▼──┐
                               │mesh0│
                               └─────┘
Is not handled anymore correctly in TT because ap0 is not a lower interface of
any batadv mesh interface. And as result, the ap-isolation feature of TT
will break.

Kind regards,
        Sven

Hmm, this is a tricky one. Only having the hardifs around while they're used for meshing means we need some other way to determine the wifi flags - but doing it on demand for every batadv_tt_local_add() seems like it could be used to facilitate a DoS on the RTNL by causing large numbers of TT entries to be added, as the lock needs to be held for resolving the iflink.

One option might be to add a cache for the wifi flag (and possible other information, I'll have to check if there is anything else), but store it in the mesh interface, only for interfaces that are bridged with the mesh. Cache entries could be created on demand when a local TT entry is added for an unknown IIF; when to remove cache entries is something I'll have to figure out.

Simpler ideas how to solve this are welcome :)

Best,
Matthias

Attachment: OpenPGP_signature.asc
Description: OpenPGP digital signature

Reply via email to