> I would prefer when you would call cancel_work_sync when metric stuff should
> be stopped. I was expecting to see this somewhere around
> batadv_v_elp_iface_disable after the cancel_work_sync but it seems like it is
> missing there (or in a similar place)
>

I tried this:

diff --git a/net/batman-adv/bat_v_elp.c b/net/batman-adv/bat_v_elp.c
index 1d704574..b35ded79 100644
--- a/net/batman-adv/bat_v_elp.c
+++ b/net/batman-adv/bat_v_elp.c
@@ -387,8 +387,20 @@ int batadv_v_elp_iface_enable(struct
batadv_hard_iface *hard_iface)
  */
 void batadv_v_elp_iface_disable(struct batadv_hard_iface *hard_iface)
 {
+       struct batadv_hardif_neigh_node *hardif_neigh;
+
        cancel_delayed_work_sync(&hard_iface->bat_v.elp_wq);

+       rcu_read_lock();
+       hlist_for_each_entry_rcu(hardif_neigh,
+                                &hard_iface->neigh_list, list) {
+               if (!kref_get_unless_zero(&hardif_neigh->refcount))
+                       continue;
+               cancel_work_sync(&hardif_neigh->bat_v.metric_work);
+               batadv_hardif_neigh_put(hardif_neigh);
+       }
+       rcu_read_unlock();
+
        dev_kfree_skb(hard_iface->bat_v.elp_skb);
        hard_iface->bat_v.elp_skb = NULL;
 }

But it seems to cause a hang on reboot every once in a while. When the hang
happens, I'm not able to trigger sysrq over serial.

I can try kgdb, but that requires sysrq to work, so I'm not sure how
I can gain control after the machine becomes unresponsive.

I'm not sure why there isn't a watchdog bite when this happens.

Andy

Reply via email to