On Wed, May 31, 2017 at 10:11:45AM +0200, [email protected] wrote:
> s = socket(res->ai_family,
> res->ai_socktype,res->ai_protocol);
> if (s == -1)
> errx(1,"socket");
> int enable = 1;
> if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &enable,
> sizeof(int)) < 0)
> errx(1, "setsockopt");
> if (connect(s, res->ai_addr, res->ai_addrlen) == -1)
> errx(1,"connect");
I have discussed this with mpi@ and found a solution. Using
SO_REUSEADDR with TCP connect is not a smart thing in general, and
our kernel is missing a check. The 4-tuple of src/dst ip/port must
be unique for TCP. The in_pcbhashlookup() in in_pcbconnect()
enforces this. But if the socket is not bound, the automatic bind
by connect happens after the check. As the socket has the SO_REUSEADDR
flag, in_pcbbind() may select a local port that is already in use.
Then we have two colliding TCP connections on the same 4-tuple. As
they have different sequence numbers, the stack sends ACK packets
to show that the numbers are wrong. This creates a packet storm on
loopback. The softnet task is constantly holding the netlock and
has a high priority, so nothing else runs. pf would drop the wrong
packets as the sequence number does not match, but the default rule
set has "set skip on lo".
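
To make the ordering problem concrete, here is a small userland model,
not kernel code. struct tuple, tuple_in_use() and
connect_check_then_bind() are hypothetical names invented for this
sketch; tuple_in_use() only stands in for in_pcbhashlookup(), and
picked_port models whatever port the automatic bind happens to choose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a TCP connection, identified by its 4-tuple. */
struct tuple {
	uint32_t laddr, faddr;	/* local and foreign address */
	uint16_t lport, fport;	/* local and foreign port, 0 = unbound */
};

/* Toy stand-in for in_pcbhashlookup(): linear search for an exact match. */
static bool
tuple_in_use(const struct tuple *tab, size_t n, const struct tuple *t)
{
	for (size_t i = 0; i < n; i++) {
		if (tab[i].laddr == t->laddr && tab[i].lport == t->lport &&
		    tab[i].faddr == t->faddr && tab[i].fport == t->fport)
			return true;
	}
	return false;
}

/*
 * Old ordering: the uniqueness check runs while lport is still 0, so
 * it cannot match an existing connection.  The port is filled in
 * afterwards and nothing looks at the table again.
 * Returns true if the connect would be allowed to proceed.
 */
static bool
connect_check_then_bind(const struct tuple *tab, size_t n, struct tuple t,
    uint16_t picked_port, struct tuple *out)
{
	assert(out != NULL);
	if (tuple_in_use(tab, n, &t))
		return false;
	if (t.lport == 0)
		t.lport = picked_port;	/* may be a port already in use */
	*out = t;
	return true;
}
```

If the table already holds 127.0.0.1:40000 -> 127.0.0.1:80 and the
automatic bind picks 40000 again, this model lets the connect through
even though the resulting 4-tuple is a duplicate.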
I think the best fix is to do the in_pcbhashlookup() again after
in_pcbbind(). This creates sporadic "connect: Address already in
use" errors instead of a hang. An alternative would be to make
in_pcbbind() aware of the connect address so that it can search
for an unused bind port in a smarter way. But that would be a lot
of complexity for a stupid use case. Another option would be to
forbid SO_REUSEADDR with TCP, but there are cases where this is
legal. Using the same source port when connecting to different
servers is allowed.
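
As a rough illustration of the proposed ordering, again a userland
model and not the kernel code: struct tuple, tuple_in_use() and
connect_bind_then_recheck() are hypothetical names for this sketch,
with tuple_in_use() standing in for in_pcbhashlookup() and picked_port
for the port chosen by the automatic bind.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a TCP connection, identified by its 4-tuple. */
struct tuple {
	uint32_t laddr, faddr;	/* local and foreign address */
	uint16_t lport, fport;	/* local and foreign port, 0 = unbound */
};

/* Toy stand-in for in_pcbhashlookup(): linear search for an exact match. */
static bool
tuple_in_use(const struct tuple *tab, size_t n, const struct tuple *t)
{
	for (size_t i = 0; i < n; i++) {
		if (tab[i].laddr == t->laddr && tab[i].lport == t->lport &&
		    tab[i].faddr == t->faddr && tab[i].fport == t->fport)
			return true;
	}
	return false;
}

/*
 * Proposed ordering: check, bind, then check again with the real port.
 * On a collision the port is reset to 0, so the socket is left unbound.
 * Returns 0 on success or EADDRINUSE.
 */
static int
connect_bind_then_recheck(const struct tuple *tab, size_t n,
    struct tuple *t, uint16_t picked_port)
{
	assert(t != NULL);
	if (tuple_in_use(tab, n, t))
		return EADDRINUSE;
	if (t->lport == 0) {
		t->lport = picked_port;
		if (tuple_in_use(tab, n, t)) {
			t->lport = 0;	/* undo the automatic bind */
			return EADDRINUSE;
		}
	}
	return 0;
}
```

With one existing connection from local port 40000 to server A, a
second connect to server A that gets port 40000 fails with EADDRINUSE
in this model, while a connect to a different server B with the same
source port succeeds. That is the legal case mentioned above.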
oantoine, could you test whether this diff fixes your setup?
bluhm
Index: netinet/in_pcb.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/in_pcb.c,v
retrieving revision 1.222
diff -u -p -r1.222 in_pcb.c
--- netinet/in_pcb.c 9 Jun 2017 12:56:43 -0000 1.222
+++ netinet/in_pcb.c 2 Aug 2017 15:45:21 -0000
@@ -533,15 +533,23 @@ in_pcbconnect(struct inpcb *inp, struct
return (error);
if (in_pcbhashlookup(inp->inp_table, sin->sin_addr, sin->sin_port,
- *ina, inp->inp_lport, inp->inp_rtableid) != 0)
+ *ina, inp->inp_lport, inp->inp_rtableid) != NULL)
return (EADDRINUSE);
KASSERT(inp->inp_laddr.s_addr == INADDR_ANY || inp->inp_lport);
if (inp->inp_laddr.s_addr == INADDR_ANY) {
- if (inp->inp_lport == 0 &&
- in_pcbbind(inp, NULL, curproc) == EADDRNOTAVAIL)
- return (EADDRNOTAVAIL);
+ if (inp->inp_lport == 0) {
+ error = in_pcbbind(inp, NULL, curproc);
+ if (error)
+ return (error);
+ if (in_pcbhashlookup(inp->inp_table, sin->sin_addr,
+ sin->sin_port, *ina, inp->inp_lport,
+ inp->inp_rtableid) != NULL) {
+ inp->inp_lport = 0;
+ return (EADDRINUSE);
+ }
+ }
inp->inp_laddr = *ina;
}
inp->inp_faddr = sin->sin_addr;
Index: netinet6/in6_pcb.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/in6_pcb.c,v
retrieving revision 1.98
diff -u -p -r1.98 in6_pcb.c
--- netinet6/in6_pcb.c 13 May 2017 17:42:55 -0000 1.98
+++ netinet6/in6_pcb.c 2 Aug 2017 15:48:22 -0000
@@ -284,16 +284,24 @@ in6_pcbconnect(struct inpcb *inp, struct
if (in6_pcbhashlookup(inp->inp_table, &sin6->sin6_addr, sin6->sin6_port,
IN6_IS_ADDR_UNSPECIFIED(&inp->inp_laddr6) ? in6a : &inp->inp_laddr6,
- inp->inp_lport, inp->inp_rtableid)) {
+ inp->inp_lport, inp->inp_rtableid) != NULL) {
return (EADDRINUSE);
}
KASSERT(IN6_IS_ADDR_UNSPECIFIED(&inp->inp_laddr6) || inp->inp_lport);
if (IN6_IS_ADDR_UNSPECIFIED(&inp->inp_laddr6)) {
- if (inp->inp_lport == 0 &&
- in_pcbbind(inp, NULL, curproc) == EADDRNOTAVAIL)
- return (EADDRNOTAVAIL);
+ if (inp->inp_lport == 0) {
+ error = in_pcbbind(inp, NULL, curproc);
+ if (error)
+ return (error);
+ if (in6_pcbhashlookup(inp->inp_table, &sin6->sin6_addr,
+ sin6->sin6_port, in6a, inp->inp_lport,
+ inp->inp_rtableid) != NULL) {
+ inp->inp_lport = 0;
+ return (EADDRINUSE);
+ }
+ }
inp->inp_laddr6 = *in6a;
}
inp->inp_faddr6 = sin6->sin6_addr;