Re: in6_leavegroup work queue

2013-11-05 Thread Martin Pieuchot
Alexander,

I spent quite some time working on this problem and I found some
interesting information, see below.

On 31/10/13(Thu) 17:20, Alexander Bluhm wrote:
 On Thu, Oct 31, 2013 at 09:56:11AM +0100, Martin Pieuchot wrote:
  On 30/10/13(Wed) 16:48, Alexander Bluhm wrote:
  [...]
 
  I'm not sure to get the whole picture about this ioctl called in an
  interrupt context so feel free to correct me, but the only offending
  function is nd6_timer(), is that right?
 
 No, this code is also called from softnet interrupt context.  An
 autoconfigured IPv6 address can deleted by a timeout or by a router
 advertisement.
 
 nd6_ra_input() - prelist_update() - nd6_prelist_add() -
 purge_detached() - in6_purgeaddr() - in6_leavegroup()

By looking at the history of nd6_rtr.c, rev 1.54 tell us that this
problem of adding/deleting a prefix or address in interrupt context
is not new. 

Unfortunately the hack introduced at that time only works for a subset
of prelist_update(): creating an addresses.

But the story of prelist_update() does not end here. Since rev 1.55 was
introduced to work around another problem caused by the fact that
prelist_update() can run multiple times before the task scheduled to
create an address has run.

So I think that we should modify how/when prelist_update() is executed,

 In theory we can ignore the incomming packet, we may always loose
 packets and hope for a retransmit.  But that would mean to avoid
 the sleep and do a rollback.  That would be a horrible diff.

I'm not sure to understand the whole picture once again, but can't
we defer the (whole) processing of a router advertisement message in
a task?

  Finally since we are moving away from workq(9) to task(9) I'd suggest
  you to use the latter.
 
 That is easy, but it does not solve the problems you describe.  I
 have to call task_set() in in6_leavegroup() to set the if_index.
 May be calling task_set() can be moved to somewhere else but that
 does not look easy.
 
 I am still not happy with this diff although I think it does not
 cause memory corruption.  Look at this call path:
 
 if_detach() - in6_ifdetach() - in6_purgeaddr() - in6_leavegroup()
 
 As I delay the SIOCDELMULTI after the interface is destroyed,
 the ioctl is never called.

That doesn't matter ;) If the interface is gone there's no hardware
filter to update.

 Perhaps we look at the problem from the wrong direction.  Why do
 some network drivers sleep when changing multicast groups?  Did
 they always do that or was something changed?

Some driver sleep because that's how their author/porter wrote them
since they are allowed to sleep in ioctls. Most (all) of the USB drivers
are written this way, since synchronous transfer submission makes the
caller sleep until the transaction is processed.

But our problem here is that the nd6 code *abuse* the ioctl interface in
interrupt context.

So we could take another direction and design a new interface that does
not allow to sleep when updating the multicast filters of an interface.
But I believe it would requires much more changes and since this whole
prelist_update() looks like a whole hack to me, I would prefer if
someone could clean it.

Martin



Re: in6_leavegroup work queue

2013-10-31 Thread Alexander Bluhm
On Thu, Oct 31, 2013 at 09:56:11AM +0100, Martin Pieuchot wrote:
 On 30/10/13(Wed) 16:48, Alexander Bluhm wrote:
  Now I use the if_index to detect that the interface is gone.
 
 Do you know if the memory pointed by the imm pointer you're passing to
 your workq can be freed before the task got scheduled, if the interface
 is removed/destroyed for example?

I am pretty sure that this works.  The old in6_leavegroup() code
called free on it, so it must not be referenced anywhere else.  All
imm pointer come from the ia6_memberships list, it is emptied while
calling in6_leavegroup().

 There's also a risk of calling in6_leavegroup() a second time before the
 first task get scheduled, in this case the second task will have a bogus
 imm pointer.  This problem can be avoided by using a task(9) though. 

in6_leavegroup() is only called once per imm as it is the in6_multi_mship
free function.

Note that struct in6_multi has a reference counter, so imm-i6mm_maddr
should be valid memory.  Also in6_delmulti() itself calls splsoftnet()
so there never was protection against timeout or softnet interrupts
calling in6_leavegroup() again.

 I'm not sure to get the whole picture about this ioctl called in an
 interrupt context so feel free to correct me, but the only offending
 function is nd6_timer(), is that right?

No, this code is also called from softnet interrupt context.  An
autoconfigured IPv6 address can deleted by a timeout or by a router
advertisement.

nd6_ra_input() - prelist_update() - nd6_prelist_add() -
purge_detached() - in6_purgeaddr() - in6_leavegroup()

In theory we can ignore the incomming packet, we may always loose
packets and hope for a retransmit.  But that would mean to avoid
the sleep and do a rollback.  That would be a horrible diff.

 Finally since we are moving away from workq(9) to task(9) I'd suggest
 you to use the latter.

That is easy, but it does not solve the problems you describe.  I
have to call task_set() in in6_leavegroup() to set the if_index.
May be calling task_set() can be moved to somewhere else but that
does not look easy.

I am still not happy with this diff although I think it does not
cause memory corruption.  Look at this call path:

if_detach() - in6_ifdetach() - in6_purgeaddr() - in6_leavegroup()

As I delay the SIOCDELMULTI after the interface is destroyed,
the ioctl is never called.

Perhaps we look at the problem from the wrong direction.  Why do
some network drivers sleep when changing multicast groups?  Did
they always do that or was something changed?

bluhm

Index: netinet6/in6.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/in6.c,v
retrieving revision 1.122
diff -u -p -r1.122 in6.c
--- netinet6/in6.c  24 Oct 2013 11:20:18 -  1.122
+++ netinet6/in6.c  31 Oct 2013 14:40:05 -
@@ -124,6 +124,7 @@ int in6_lifaddr_ioctl(struct socket *, u
 int in6_ifinit(struct ifnet *, struct in6_ifaddr *, int);
 void in6_unlink_ifa(struct in6_ifaddr *, struct ifnet *);
 void in6_ifloop_request(int, struct ifaddr *);
+void in6_leavegroup_task(void *, void *);
 
 const struct sockaddr_in6 sa6_any = {
sizeof(sa6_any), AF_INET6, 0, 0, IN6ADDR_ANY_INIT, 0
@@ -1777,16 +1778,18 @@ in6_delmulti(struct in6_multi *in6m)
ifafree(in6m-in6m_ia-ia_ifa); /* release reference */
}
 
-   /*
-* Notify the network driver to update its multicast
-* reception filter.
-*/
-   bzero(ifr.ifr_addr, sizeof(struct sockaddr_in6));
-   ifr.ifr_addr.sin6_len = sizeof(struct sockaddr_in6);
-   ifr.ifr_addr.sin6_family = AF_INET6;
-   ifr.ifr_addr.sin6_addr = in6m-in6m_addr;
-   (*in6m-in6m_ifp-if_ioctl)(in6m-in6m_ifp,
-   SIOCDELMULTI, (caddr_t)ifr);
+   if (in6m-in6m_ifp != NULL) {
+   /*
+* Notify the network driver to update its multicast
+* reception filter.
+*/
+   bzero(ifr.ifr_addr, sizeof(struct sockaddr_in6));
+   ifr.ifr_addr.sin6_len = sizeof(struct sockaddr_in6);
+   ifr.ifr_addr.sin6_family = AF_INET6;
+   ifr.ifr_addr.sin6_addr = in6m-in6m_addr;
+   (*in6m-in6m_ifp-if_ioctl)(in6m-in6m_ifp,
+   SIOCDELMULTI, (caddr_t)ifr);
+   }
free(in6m, M_IPMADDR);
}
splx(s);
@@ -1814,11 +1817,27 @@ in6_joingroup(struct ifnet *ifp, struct 
 int
 in6_leavegroup(struct in6_multi_mship *imm)
 {
+   if (imm-i6mm_maddr) {
+   task_set(imm-i6mm_task, in6_leavegroup_task, imm,
+   (void *)(unsigned long)imm-i6mm_maddr-in6m_ifp-if_index);
+   task_add(systq, imm-i6mm_task);
+   } else
+   free(imm,  

Re: in6_leavegroup work queue

2013-10-30 Thread Alexander Bluhm
On Fri, Oct 18, 2013 at 01:00:25PM +0200, Martin Pieuchot wrote:
 On 18/10/13(Fri) 12:45, Alexander Bluhm wrote:
  
  Ethernet drivers connected via USB might sleep when their multicast
  group filter is modified.  Unfortunately this happens from softclock
  or softnet interrupt when IPv6 decides to unconfigure its addresses
  automatically.
  
  An obvious solution is to use a work queue.  I have put the workq
  storage into struct in6_multi_mship.  That requires to include
  sys/workq.h before compiling this struct.
 
 One problem with a work queue is that you cannot guarantee the interface
 will still be here when the task will be scheduled, so passing a pointer
 might not be a good idea.
 
 That is why I wanted to change the if_get() API to use unique index for
 each interface...
 
 But maybe there's another solution at this problem than using a workq...

Now I use the if_index to detect that the interface is gone.

I this a better approach?

bluhm

Index: netinet6/in6.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/in6.c,v
retrieving revision 1.122
diff -u -p -u -p -r1.122 in6.c
--- netinet6/in6.c  24 Oct 2013 11:20:18 -  1.122
+++ netinet6/in6.c  30 Oct 2013 15:01:20 -
@@ -124,6 +124,7 @@ int in6_lifaddr_ioctl(struct socket *, u
 int in6_ifinit(struct ifnet *, struct in6_ifaddr *, int);
 void in6_unlink_ifa(struct in6_ifaddr *, struct ifnet *);
 void in6_ifloop_request(int, struct ifaddr *);
+void in6_leavegroup_task(void *, void *);
 
 const struct sockaddr_in6 sa6_any = {
sizeof(sa6_any), AF_INET6, 0, 0, IN6ADDR_ANY_INIT, 0
@@ -1777,16 +1778,18 @@ in6_delmulti(struct in6_multi *in6m)
ifafree(in6m-in6m_ia-ia_ifa); /* release reference */
}
 
-   /*
-* Notify the network driver to update its multicast
-* reception filter.
-*/
-   bzero(ifr.ifr_addr, sizeof(struct sockaddr_in6));
-   ifr.ifr_addr.sin6_len = sizeof(struct sockaddr_in6);
-   ifr.ifr_addr.sin6_family = AF_INET6;
-   ifr.ifr_addr.sin6_addr = in6m-in6m_addr;
-   (*in6m-in6m_ifp-if_ioctl)(in6m-in6m_ifp,
-   SIOCDELMULTI, (caddr_t)ifr);
+   if (in6m-in6m_ifp != NULL) {
+   /*
+* Notify the network driver to update its multicast
+* reception filter.
+*/
+   bzero(ifr.ifr_addr, sizeof(struct sockaddr_in6));
+   ifr.ifr_addr.sin6_len = sizeof(struct sockaddr_in6);
+   ifr.ifr_addr.sin6_family = AF_INET6;
+   ifr.ifr_addr.sin6_addr = in6m-in6m_addr;
+   (*in6m-in6m_ifp-if_ioctl)(in6m-in6m_ifp,
+   SIOCDELMULTI, (caddr_t)ifr);
+   }
free(in6m, M_IPMADDR);
}
splx(s);
@@ -1814,11 +1817,26 @@ in6_joingroup(struct ifnet *ifp, struct 
 int
 in6_leavegroup(struct in6_multi_mship *imm)
 {
-
if (imm-i6mm_maddr)
-   in6_delmulti(imm-i6mm_maddr);
-   free(imm,  M_IPMADDR);
+   workq_queue_task(NULL, imm-wqt, 0, in6_leavegroup_task, imm,
+   (void *)(unsigned long)imm-i6mm_maddr-in6m_ifp-if_index);
+   else 
+   free(imm,  M_IPMADDR);
return 0;
+}
+
+void
+in6_leavegroup_task(void *arg1, void *arg2)
+{
+   struct in6_multi_mship  *imm = arg1;
+   unsigned int index = (unsigned long)arg2;
+
+   /* If interface has been be freed, avoid call to if_ioctl(). */
+   if (if_get(index) != imm-i6mm_maddr-in6m_ifp)
+   imm-i6mm_maddr-in6m_ifp = NULL;
+
+   in6_delmulti(imm-i6mm_maddr);
+   free(imm,  M_IPMADDR);
 }
 
 /*
Index: netinet6/in6_var.h
===
RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/in6_var.h,v
retrieving revision 1.44
diff -u -p -u -p -r1.44 in6_var.h
--- netinet6/in6_var.h  24 Oct 2013 11:31:43 -  1.44
+++ netinet6/in6_var.h  30 Oct 2013 15:01:20 -
@@ -64,6 +64,8 @@
 #ifndef _NETINET6_IN6_VAR_H_
 #define _NETINET6_IN6_VAR_H_
 
+#include sys/workq.h
+
 /*
  * Interface address, Internet version.  One of these structures
  * is allocated for each interface with an Internet address.
@@ -480,6 +482,7 @@ do {
\
  * belongs to.
  */
 struct in6_multi_mship {
+   struct  workq_task wqt; /* Allow network driver to sleep */
struct  in6_multi *i6mm_maddr;  /* Multicast address pointer */
LIST_ENTRY(in6_multi_mship) i6mm_chain;  /* multicast options chain */
 };



in6_leavegroup work queue

2013-10-18 Thread Alexander Bluhm
Hi,

Ethernet drivers connected via USB might sleep when their multicast
group filter is modified.  Unfortunately this happens from softclock
or softnet interrupt when IPv6 decides to unconfigure its addresses
automatically.

An obvious solution is to use a work queue.  I have put the workq
storage into struct in6_multi_mship.  That requires to include
sys/workq.h before compiling this struct.

Is it a good idea to include sys/workq.h directly form netinet6/in6_var.h?
An alternative is to include sys/workq.h from 50 kernel .c files.

What is the right way?

Note that netinet6/in6_var.h is included from user space and even
from some ports.

bluhm

Index: netinet6/in6.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/in6.c,v
retrieving revision 1.120
diff -u -p -u -p -r1.120 in6.c
--- netinet6/in6.c  17 Oct 2013 16:27:45 -  1.120
+++ netinet6/in6.c  18 Oct 2013 10:08:27 -
@@ -124,6 +124,7 @@ int in6_lifaddr_ioctl(struct socket *, u
 int in6_ifinit(struct ifnet *, struct in6_ifaddr *, int);
 void in6_unlink_ifa(struct in6_ifaddr *, struct ifnet *);
 void in6_ifloop_request(int, struct ifaddr *);
+void in6_leavegroup_task(void *, void *);
 
 const struct sockaddr_in6 sa6_any = {
sizeof(sa6_any), AF_INET6, 0, 0, IN6ADDR_ANY_INIT, 0
@@ -1816,9 +1817,20 @@ in6_leavegroup(struct in6_multi_mship *i
 {
 
if (imm-i6mm_maddr)
-   in6_delmulti(imm-i6mm_maddr);
-   free(imm,  M_IPMADDR);
+   workq_queue_task(NULL, imm-wqt, 0, in6_leavegroup_task, imm,
+   NULL);
+   else 
+   free(imm,  M_IPMADDR);
return 0;
+}
+
+void
+in6_leavegroup_task(void *arg1, void *arg2)
+{
+   struct in6_multi_mship *imm = arg1;
+
+   in6_delmulti(imm-i6mm_maddr);
+   free(imm,  M_IPMADDR);
 }
 
 /*
Index: netinet6/in6_var.h
===
RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/in6_var.h,v
retrieving revision 1.42
diff -u -p -u -p -r1.42 in6_var.h
--- netinet6/in6_var.h  14 Oct 2013 11:07:42 -  1.42
+++ netinet6/in6_var.h  18 Oct 2013 10:09:52 -
@@ -64,6 +64,8 @@
 #ifndef _NETINET6_IN6_VAR_H_
 #define _NETINET6_IN6_VAR_H_
 
+#include sys/workq.h
+
 /*
  * Interface address, Internet version.  One of these structures
  * is allocated for each interface with an Internet address.
@@ -486,6 +488,7 @@ do {
\
  * belongs to.
  */
 struct in6_multi_mship {
+   struct  workq_task wqt; /* Allow network driver to sleep */
struct  in6_multi *i6mm_maddr;  /* Multicast address pointer */
LIST_ENTRY(in6_multi_mship) i6mm_chain;  /* multicast options chain */
 };



Re: in6_leavegroup work queue

2013-10-18 Thread Martin Pieuchot
Hi Alexander,

On 18/10/13(Fri) 12:45, Alexander Bluhm wrote:
 Hi,
 
 Ethernet drivers connected via USB might sleep when their multicast
 group filter is modified.  Unfortunately this happens from softclock
 or softnet interrupt when IPv6 decides to unconfigure its addresses
 automatically.
 
 An obvious solution is to use a work queue.  I have put the workq
 storage into struct in6_multi_mship.  That requires to include
 sys/workq.h before compiling this struct.

One problem with a work queue is that you cannot guarantee the interface
will still be here when the task will be scheduled, so passing a pointer
might not be a good idea.

That is why I wanted to change the if_get() API to use unique index for
each interface...

But maybe there's another solution at this problem than using a workq...

Martin