On 2016-01-29, Kim Zeitler <kim.zeit...@konzept-is.de> wrote:
> On 01/29/16 15:00, Stuart Henderson wrote:
>
>> $ curl https://owncloud.XXXXXXXXXX/apps/files_pdfviewer/js/previewplugin.js
>> curl: (7) Failed to connect to owncloud.XXXXXXXXXX port 443: Operation timed
>> out
>>
>>> I have access to the logs and they show a mixture of 200 and 503
>>
>> ...and that pretty much matches the pattern I've seen connecting by
>> hand, so it's no big surprise that there are problems with the proxy
>> too.
> Glad that you could reproduce the problem, I was starting to doubt my
> own abilities with a 'simple' proxy.
>
>> If you have contact with any of the site admins see if they are
>> running on linux with tcp_tw_recycle=1, I think there is a strong
>> possibility that they are, and if so then they should fix their
>> configuration.
> I wrote to our contact there and am trying to get the information if
> they are using this setting.
>>
>> They're likely to be breaking connections for NATted clients
>> too (and this is only going to get worse as more ISPs start
>> using CG-NAT for IPv4). The links in the above post have
>> detailed explanations.
>>
>> OpenBSD uses this method which is described in RFC7323 sec 5.4
>> (OpenBSD's implementation predates this RFC by some years).
>>
>>    o  A random offset may be added to the timestamp clock on a
>>       per-connection basis. See [RFC6528], Section 3, on randomizing
>>       the initial sequence number (ISN). The same function with a
>>       different secret key can be used to generate the per-connection
>>       timestamp offset.
>>
>> There was a recent-ish change to the method used to generate the
>> offsets (MD5 to SHA512), I wondered if that had changed anything
>> so I've just checked from a 5.6 box, it does exactly the same -
>> if I make repeated connections to the owncloud box, some of them
>> fail.
>>
> Currently am not fully able to get my mind round the details in the
> post, but if I read it correctly the machine running with tw_recycle has
> problems associating connections correctly together because similar
> host,port pairs but different timestamps. Shouldn't this cause problems
> with all proxied or nated connections? Am simply asking as I somehow
> can't fit it in that openbsd+squid shows this particular behaviour yet
> {freebsd,debian}squid does not.
>
> Thanks Stuart so far for what you have found and the patience to explain
> it to me.

Typical Linux behaviour (at least in the version I tried) is to use a
single counter for all TCP sessions from the host, so it would be more
likely to use 1,2,3 - 7,8,9 - 49,50,51 - 67,68,69.

This isn't required by TCP though. TCP only needs timestamps *within a
session* (i.e. one src+dest host+port quad) to be increasing. Multiple
sessions are treated separately and can be in any order with respect to
each other. If I understand correctly, tw_recycle reduces the key to
just src+dest *host*.

If you have two hosts with the simple behaviour (single counter) going
through a NAT, the NAT doesn't usually touch timestamps, so they will be
out of order: maybe 49,50,51 - 67,68,69 - 1,2,3 - 7,8,9. This is OK as
far as TCP goes but breaks with tw_recycle. In the NAT case it's usually
only noticed if two people from behind the same NAT visit the site
within the TIME_WAIT timeout window.

A proxy is a cutoff point. Instead of one TCP session end-to-end there
are two (client to proxy, proxy to server); the packet data are copied
across but the headers are not. The headers are subject to the behaviour
of the proxy's OS.

Now... OpenBSD randomizes timestamps per session. A random offset is
applied and stored as part of the TCP state. This is good because it's
extra entropy to help protect against blind spoofing, and it avoids
leaking information about the host's uptime.

So as a simplified example you could have 4 consecutive sessions using
1,2,3 - 49,50,51 - 67,68,69 - 7,8,9 -- and that's OK. It's within spec
for TCP, it's suggested by the newer RFC, and as you can see above, it's
totally normal for a NATted connection to act like this. It's just that
Linux's tw_recycle misfeature gets confused.

If you run the proxy on an OS which doesn't offset timestamps like this
(note that OpenBSD has done this for many years), you won't trigger it,
but run it on OpenBSD and it's easy. You'll also be able to trigger it
by connecting from a single machine with a simple timestamp but running
the connection through a PF nat with the "modulate timestamps" option.

It can be worked around on your side. But if you do that the server
admins will likely never fix things (and maybe blame it on OpenBSD), so
I'm reluctant to mention it on list. That workaround will also throttle
TCP for all connections to/from the server, limiting you to about 5Mb/s
max for transatlantic connections.
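
To make the difference concrete, here is a rough Python sketch of the
two checks applied to the timestamp sequences above. It is only a toy
model, not the Linux code: the function names are made up for
illustration, the per-host check simply remembers the last timestamp it
accepted from one source host, and it assumes every session arrives
inside the TIME_WAIT window.

```python
def per_session_check(sessions):
    """Plain TCP rule: timestamps only need to be non-decreasing
    within each session; sessions are judged independently."""
    return [all(a <= b for a, b in zip(s, s[1:])) for s in sessions]


def per_host_check(sessions):
    """Toy model of the tw_recycle idea (an assumption, not kernel code):
    one remembered timestamp per source host, and a new session is
    rejected if it starts with an older timestamp than the last one
    accepted from that host."""
    last_seen = None
    results = []
    for stamps in sessions:
        accepted = last_seen is None or stamps[0] >= last_seen
        results.append(accepted)
        if accepted:
            last_seen = stamps[-1]
    return results


# Sequences taken from the examples in this mail.
single_counter = [(1, 2, 3), (7, 8, 9), (49, 50, 51), (67, 68, 69)]
randomized = [(1, 2, 3), (49, 50, 51), (67, 68, 69), (7, 8, 9)]
two_hosts_behind_nat = [(49, 50, 51), (67, 68, 69), (1, 2, 3), (7, 8, 9)]

for name, sessions in [
    ("single counter", single_counter),
    ("per-session random offset", randomized),
    ("two hosts behind NAT", two_hosts_behind_nat),
]:
    print(name, per_session_check(sessions), per_host_check(sessions))
```

Under the per-session rule all three sequences pass. Under the per-host
rule only the single-counter sequence passes completely, which fits the
mixture of working and failing connections described earlier in the
thread.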