Re: in6_leavegroup work queue
Alexander, I spent quite some time working on this problem and I found some interesting information, see below. On 31/10/13(Thu) 17:20, Alexander Bluhm wrote: On Thu, Oct 31, 2013 at 09:56:11AM +0100, Martin Pieuchot wrote: On 30/10/13(Wed) 16:48, Alexander Bluhm wrote: [...] I'm not sure to get the whole picture about this ioctl called in an interrupt context so feel free to correct me, but the only offending function is nd6_timer(), is that right? No, this code is also called from softnet interrupt context. An autoconfigured IPv6 address can deleted by a timeout or by a router advertisement. nd6_ra_input() - prelist_update() - nd6_prelist_add() - purge_detached() - in6_purgeaddr() - in6_leavegroup() By looking at the history of nd6_rtr.c, rev 1.54 tell us that this problem of adding/deleting a prefix or address in interrupt context is not new. Unfortunately the hack introduced at that time only works for a subset of prelist_update(): creating an addresses. But the story of prelist_update() does not end here. Since rev 1.55 was introduced to work around another problem caused by the fact that prelist_update() can run multiple times before the task scheduled to create an address has run. So I think that we should modify how/when prelist_update() is executed, In theory we can ignore the incomming packet, we may always loose packets and hope for a retransmit. But that would mean to avoid the sleep and do a rollback. That would be a horrible diff. I'm not sure to understand the whole picture once again, but can't we defer the (whole) processing of a router advertisement message in a task? Finally since we are moving away from workq(9) to task(9) I'd suggest you to use the latter. That is easy, but it does not solve the problems you describe. I have to call task_set() in in6_leavegroup() to set the if_index. May be calling task_set() can be moved to somewhere else but that does not look easy. I am still not happy with this diff although I think it does not cause memory corruption. Look at this call path: if_detach() - in6_ifdetach() - in6_purgeaddr() - in6_leavegroup() As I delay the SIOCDELMULTI after the interface is destroyed, the ioctl is never called. That doesn't matter ;) If the interface is gone there's no hardware filter to update. Perhaps we look at the problem from the wrong direction. Why do some network drivers sleep when changing multicast groups? Did they always do that or was something changed? Some driver sleep because that's how their author/porter wrote them since they are allowed to sleep in ioctls. Most (all) of the USB drivers are written this way, since synchronous transfer submission makes the caller sleep until the transaction is processed. But our problem here is that the nd6 code *abuse* the ioctl interface in interrupt context. So we could take another direction and design a new interface that does not allow to sleep when updating the multicast filters of an interface. But I believe it would requires much more changes and since this whole prelist_update() looks like a whole hack to me, I would prefer if someone could clean it. Martin
Re: in6_leavegroup work queue
On Thu, Oct 31, 2013 at 09:56:11AM +0100, Martin Pieuchot wrote: On 30/10/13(Wed) 16:48, Alexander Bluhm wrote: Now I use the if_index to detect that the interface is gone. Do you know if the memory pointed by the imm pointer you're passing to your workq can be freed before the task got scheduled, if the interface is removed/destroyed for example? I am pretty sure that this works. The old in6_leavegroup() code called free on it, so it must not be referenced anywhere else. All imm pointer come from the ia6_memberships list, it is emptied while calling in6_leavegroup(). There's also a risk of calling in6_leavegroup() a second time before the first task get scheduled, in this case the second task will have a bogus imm pointer. This problem can be avoided by using a task(9) though. in6_leavegroup() is only called once per imm as it is the in6_multi_mship free function. Note that struct in6_multi has a reference counter, so imm-i6mm_maddr should be valid memory. Also in6_delmulti() itself calls splsoftnet() so there never was protection against timeout or softnet interrupts calling in6_leavegroup() again. I'm not sure to get the whole picture about this ioctl called in an interrupt context so feel free to correct me, but the only offending function is nd6_timer(), is that right? No, this code is also called from softnet interrupt context. An autoconfigured IPv6 address can deleted by a timeout or by a router advertisement. nd6_ra_input() - prelist_update() - nd6_prelist_add() - purge_detached() - in6_purgeaddr() - in6_leavegroup() In theory we can ignore the incomming packet, we may always loose packets and hope for a retransmit. But that would mean to avoid the sleep and do a rollback. That would be a horrible diff. Finally since we are moving away from workq(9) to task(9) I'd suggest you to use the latter. That is easy, but it does not solve the problems you describe. I have to call task_set() in in6_leavegroup() to set the if_index. May be calling task_set() can be moved to somewhere else but that does not look easy. I am still not happy with this diff although I think it does not cause memory corruption. Look at this call path: if_detach() - in6_ifdetach() - in6_purgeaddr() - in6_leavegroup() As I delay the SIOCDELMULTI after the interface is destroyed, the ioctl is never called. Perhaps we look at the problem from the wrong direction. Why do some network drivers sleep when changing multicast groups? Did they always do that or was something changed? bluhm Index: netinet6/in6.c === RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/in6.c,v retrieving revision 1.122 diff -u -p -r1.122 in6.c --- netinet6/in6.c 24 Oct 2013 11:20:18 - 1.122 +++ netinet6/in6.c 31 Oct 2013 14:40:05 - @@ -124,6 +124,7 @@ int in6_lifaddr_ioctl(struct socket *, u int in6_ifinit(struct ifnet *, struct in6_ifaddr *, int); void in6_unlink_ifa(struct in6_ifaddr *, struct ifnet *); void in6_ifloop_request(int, struct ifaddr *); +void in6_leavegroup_task(void *, void *); const struct sockaddr_in6 sa6_any = { sizeof(sa6_any), AF_INET6, 0, 0, IN6ADDR_ANY_INIT, 0 @@ -1777,16 +1778,18 @@ in6_delmulti(struct in6_multi *in6m) ifafree(in6m-in6m_ia-ia_ifa); /* release reference */ } - /* -* Notify the network driver to update its multicast -* reception filter. -*/ - bzero(ifr.ifr_addr, sizeof(struct sockaddr_in6)); - ifr.ifr_addr.sin6_len = sizeof(struct sockaddr_in6); - ifr.ifr_addr.sin6_family = AF_INET6; - ifr.ifr_addr.sin6_addr = in6m-in6m_addr; - (*in6m-in6m_ifp-if_ioctl)(in6m-in6m_ifp, - SIOCDELMULTI, (caddr_t)ifr); + if (in6m-in6m_ifp != NULL) { + /* +* Notify the network driver to update its multicast +* reception filter. +*/ + bzero(ifr.ifr_addr, sizeof(struct sockaddr_in6)); + ifr.ifr_addr.sin6_len = sizeof(struct sockaddr_in6); + ifr.ifr_addr.sin6_family = AF_INET6; + ifr.ifr_addr.sin6_addr = in6m-in6m_addr; + (*in6m-in6m_ifp-if_ioctl)(in6m-in6m_ifp, + SIOCDELMULTI, (caddr_t)ifr); + } free(in6m, M_IPMADDR); } splx(s); @@ -1814,11 +1817,27 @@ in6_joingroup(struct ifnet *ifp, struct int in6_leavegroup(struct in6_multi_mship *imm) { + if (imm-i6mm_maddr) { + task_set(imm-i6mm_task, in6_leavegroup_task, imm, + (void *)(unsigned long)imm-i6mm_maddr-in6m_ifp-if_index); + task_add(systq, imm-i6mm_task); + } else + free(imm,
Re: in6_leavegroup work queue
On Fri, Oct 18, 2013 at 01:00:25PM +0200, Martin Pieuchot wrote: On 18/10/13(Fri) 12:45, Alexander Bluhm wrote: Ethernet drivers connected via USB might sleep when their multicast group filter is modified. Unfortunately this happens from softclock or softnet interrupt when IPv6 decides to unconfigure its addresses automatically. An obvious solution is to use a work queue. I have put the workq storage into struct in6_multi_mship. That requires to include sys/workq.h before compiling this struct. One problem with a work queue is that you cannot guarantee the interface will still be here when the task will be scheduled, so passing a pointer might not be a good idea. That is why I wanted to change the if_get() API to use unique index for each interface... But maybe there's another solution at this problem than using a workq... Now I use the if_index to detect that the interface is gone. I this a better approach? bluhm Index: netinet6/in6.c === RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/in6.c,v retrieving revision 1.122 diff -u -p -u -p -r1.122 in6.c --- netinet6/in6.c 24 Oct 2013 11:20:18 - 1.122 +++ netinet6/in6.c 30 Oct 2013 15:01:20 - @@ -124,6 +124,7 @@ int in6_lifaddr_ioctl(struct socket *, u int in6_ifinit(struct ifnet *, struct in6_ifaddr *, int); void in6_unlink_ifa(struct in6_ifaddr *, struct ifnet *); void in6_ifloop_request(int, struct ifaddr *); +void in6_leavegroup_task(void *, void *); const struct sockaddr_in6 sa6_any = { sizeof(sa6_any), AF_INET6, 0, 0, IN6ADDR_ANY_INIT, 0 @@ -1777,16 +1778,18 @@ in6_delmulti(struct in6_multi *in6m) ifafree(in6m-in6m_ia-ia_ifa); /* release reference */ } - /* -* Notify the network driver to update its multicast -* reception filter. -*/ - bzero(ifr.ifr_addr, sizeof(struct sockaddr_in6)); - ifr.ifr_addr.sin6_len = sizeof(struct sockaddr_in6); - ifr.ifr_addr.sin6_family = AF_INET6; - ifr.ifr_addr.sin6_addr = in6m-in6m_addr; - (*in6m-in6m_ifp-if_ioctl)(in6m-in6m_ifp, - SIOCDELMULTI, (caddr_t)ifr); + if (in6m-in6m_ifp != NULL) { + /* +* Notify the network driver to update its multicast +* reception filter. +*/ + bzero(ifr.ifr_addr, sizeof(struct sockaddr_in6)); + ifr.ifr_addr.sin6_len = sizeof(struct sockaddr_in6); + ifr.ifr_addr.sin6_family = AF_INET6; + ifr.ifr_addr.sin6_addr = in6m-in6m_addr; + (*in6m-in6m_ifp-if_ioctl)(in6m-in6m_ifp, + SIOCDELMULTI, (caddr_t)ifr); + } free(in6m, M_IPMADDR); } splx(s); @@ -1814,11 +1817,26 @@ in6_joingroup(struct ifnet *ifp, struct int in6_leavegroup(struct in6_multi_mship *imm) { - if (imm-i6mm_maddr) - in6_delmulti(imm-i6mm_maddr); - free(imm, M_IPMADDR); + workq_queue_task(NULL, imm-wqt, 0, in6_leavegroup_task, imm, + (void *)(unsigned long)imm-i6mm_maddr-in6m_ifp-if_index); + else + free(imm, M_IPMADDR); return 0; +} + +void +in6_leavegroup_task(void *arg1, void *arg2) +{ + struct in6_multi_mship *imm = arg1; + unsigned int index = (unsigned long)arg2; + + /* If interface has been be freed, avoid call to if_ioctl(). */ + if (if_get(index) != imm-i6mm_maddr-in6m_ifp) + imm-i6mm_maddr-in6m_ifp = NULL; + + in6_delmulti(imm-i6mm_maddr); + free(imm, M_IPMADDR); } /* Index: netinet6/in6_var.h === RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/in6_var.h,v retrieving revision 1.44 diff -u -p -u -p -r1.44 in6_var.h --- netinet6/in6_var.h 24 Oct 2013 11:31:43 - 1.44 +++ netinet6/in6_var.h 30 Oct 2013 15:01:20 - @@ -64,6 +64,8 @@ #ifndef _NETINET6_IN6_VAR_H_ #define _NETINET6_IN6_VAR_H_ +#include sys/workq.h + /* * Interface address, Internet version. One of these structures * is allocated for each interface with an Internet address. @@ -480,6 +482,7 @@ do { \ * belongs to. */ struct in6_multi_mship { + struct workq_task wqt; /* Allow network driver to sleep */ struct in6_multi *i6mm_maddr; /* Multicast address pointer */ LIST_ENTRY(in6_multi_mship) i6mm_chain; /* multicast options chain */ };
in6_leavegroup work queue
Hi, Ethernet drivers connected via USB might sleep when their multicast group filter is modified. Unfortunately this happens from softclock or softnet interrupt when IPv6 decides to unconfigure its addresses automatically. An obvious solution is to use a work queue. I have put the workq storage into struct in6_multi_mship. That requires to include sys/workq.h before compiling this struct. Is it a good idea to include sys/workq.h directly form netinet6/in6_var.h? An alternative is to include sys/workq.h from 50 kernel .c files. What is the right way? Note that netinet6/in6_var.h is included from user space and even from some ports. bluhm Index: netinet6/in6.c === RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/in6.c,v retrieving revision 1.120 diff -u -p -u -p -r1.120 in6.c --- netinet6/in6.c 17 Oct 2013 16:27:45 - 1.120 +++ netinet6/in6.c 18 Oct 2013 10:08:27 - @@ -124,6 +124,7 @@ int in6_lifaddr_ioctl(struct socket *, u int in6_ifinit(struct ifnet *, struct in6_ifaddr *, int); void in6_unlink_ifa(struct in6_ifaddr *, struct ifnet *); void in6_ifloop_request(int, struct ifaddr *); +void in6_leavegroup_task(void *, void *); const struct sockaddr_in6 sa6_any = { sizeof(sa6_any), AF_INET6, 0, 0, IN6ADDR_ANY_INIT, 0 @@ -1816,9 +1817,20 @@ in6_leavegroup(struct in6_multi_mship *i { if (imm-i6mm_maddr) - in6_delmulti(imm-i6mm_maddr); - free(imm, M_IPMADDR); + workq_queue_task(NULL, imm-wqt, 0, in6_leavegroup_task, imm, + NULL); + else + free(imm, M_IPMADDR); return 0; +} + +void +in6_leavegroup_task(void *arg1, void *arg2) +{ + struct in6_multi_mship *imm = arg1; + + in6_delmulti(imm-i6mm_maddr); + free(imm, M_IPMADDR); } /* Index: netinet6/in6_var.h === RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/in6_var.h,v retrieving revision 1.42 diff -u -p -u -p -r1.42 in6_var.h --- netinet6/in6_var.h 14 Oct 2013 11:07:42 - 1.42 +++ netinet6/in6_var.h 18 Oct 2013 10:09:52 - @@ -64,6 +64,8 @@ #ifndef _NETINET6_IN6_VAR_H_ #define _NETINET6_IN6_VAR_H_ +#include sys/workq.h + /* * Interface address, Internet version. One of these structures * is allocated for each interface with an Internet address. @@ -486,6 +488,7 @@ do { \ * belongs to. */ struct in6_multi_mship { + struct workq_task wqt; /* Allow network driver to sleep */ struct in6_multi *i6mm_maddr; /* Multicast address pointer */ LIST_ENTRY(in6_multi_mship) i6mm_chain; /* multicast options chain */ };
Re: in6_leavegroup work queue
Hi Alexander, On 18/10/13(Fri) 12:45, Alexander Bluhm wrote: Hi, Ethernet drivers connected via USB might sleep when their multicast group filter is modified. Unfortunately this happens from softclock or softnet interrupt when IPv6 decides to unconfigure its addresses automatically. An obvious solution is to use a work queue. I have put the workq storage into struct in6_multi_mship. That requires to include sys/workq.h before compiling this struct. One problem with a work queue is that you cannot guarantee the interface will still be here when the task will be scheduled, so passing a pointer might not be a good idea. That is why I wanted to change the if_get() API to use unique index for each interface... But maybe there's another solution at this problem than using a workq... Martin