Re: stale connections, keepalive?

2016-08-29 Thread Greg Troxel

Tomasz Sterna writes:

> See io.keepalive [1][2] options.
> Setting this up will flush a single whitespace character over the wire
> when the connection dangles idle. This triggers the TCP layer connection
> validation.

I set this up as:
  check every 300
  close connections idle for 86400 (1d)
  send keepalive after 14400 (4h)
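
To make the interaction of those three numbers concrete, here is a rough
sketch in Python. It is only a model of the policy as I understand it,
not jabberd2's actual code; the function name, the constant names, and
the idea of tracking a separate last-keepalive timestamp are all
assumptions made for the example.

```python
# Hypothetical model of the three settings above; not taken from jabberd2.
CHECK_INTERVAL = 300      # seconds between sweeps over all connections
IDLE_TIMEOUT = 86400      # close connections idle this long (1d)
KEEPALIVE_AFTER = 14400   # send a keepalive after this much silence (4h)


def check_connection(last_activity, last_keepalive, now):
    """Return the action one sweep would take for a single connection."""
    idle = now - last_activity
    if idle >= IDLE_TIMEOUT:
        return "close"
    # Only send another keepalive once the previous one is old enough,
    # so a sweep every CHECK_INTERVAL does not send one each time.
    if idle >= KEEPALIVE_AFTER and now - last_keepalive >= KEEPALIVE_AFTER:
        return "keepalive"
    return "none"
```

With these values a silent connection gets a keepalive at the first
sweep after 4h, then at most one every 4h, and is closed at 1d if
nothing real ever arrives.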

I realized I was seeing the problem on my other server too, just less often
(because I usually don't connect to it from my phone).  I'll see if this resolves
the issue.  (My phone getting a wakeup every 4h is going to be zero
compared to what happens in between, I think.)




Re: stale connections, keepalive?

2016-08-29 Thread Tomasz Sterna
On Mon, 29.08.2016 at 12:41 -0400, Greg Troxel wrote:
> > dropping idle connections from its NAT table without
> > telling anyone, so later when mobile network closed a connection it
> > silently dropped RST packets not knowing who to NAT them to. [...]
> Are you saying that a cell provider tracks TCP state and when the
> data connection is lost sends RST packets for open connections?

I am blissfully oblivious to the inner workings of wide area switching
networks, but it sure looked that way when I was investigating the
dangling connections issue.

And a quick look at PDP_context[1] gives the impression that it has
specific knowledge of the established connections.

[1] https://en.wikipedia.org/wiki/GPRS_core_network#PDP_context

-- 
 /o__ 
(_<^'  All generalisations are dangerous, including this one.



Re: stale connections, keepalive?

2016-08-29 Thread Greg Troxel

Tomasz Sterna writes:

> jabberd2 has support for application layer keepalives.
>
> See io.keepalive [1][2] options.
> Setting this up will flush a single whitespace character over the wire
> when the connection dangles idle. This triggers the TCP layer connection
> validation.

Great - thanks very much for that pointer.

> Having said that, I am running my server with neither application layer
> nor TCP keepalives turned on and see no issues with dangling
> connections.

I run two jabberd2 servers.  One of them is not behind a buggy firewall
and does not have dangling connections.  However, I'm not sure anyone is
using a phone.

> But.. I had them a lot, when my server was behind a buggy Cisco router
> doing NAT. It was dropping idle connections from its NAT table without
> telling anyone, so later when mobile network closed a connection it
> silently dropped RST packets not knowing who to NAT them to. This was
> causing a lot of dangling connections on my server.

This is probably more or less the issue, and if so I can't fix it.

Are you saying that a cell provider tracks TCP state and when the data
connection is lost sends RST packets for open connections?  I hadn't
realized that, but it seems obviously sensible if a little bit of a
layer violation.




Re: stale connections, keepalive?

2016-08-29 Thread Greg Troxel

Christof Meerwald writes:

> On Sun, Aug 28, 2016 at 02:13:34PM +0200, Tomasz Sterna wrote:
>> On Sat, 27.08.2016 at 14:55 -0400, Greg Troxel wrote:
>> >   should jabberd2 force TCP keepalive on?
>> I'm not sure whether it is possible.
>> At least on Linux it is a system-wide setting and requires root to
>> change.
>
> Are you sure? There appear to be some socket options that can be set
> for each socket:
>
> http://www.tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/#setsockopt

I'm not running Linux, so we're talking more or less about what POSIX
specifies for the BSD sockets interface.  But that page describes, in
the SO_KEEPALIVE case, exactly the traditional BSD socket option for
keepalive, which I suspect dates from about 4.2BSD, though my memory of
the late 80s is now a bit fuzzy.

So yes, I meant to have a way to enable keepalive via SO_KEEPALIVE on
all sockets.  But that's not really the right thing.

Tomasz's point about Linux and system-wide setting is probably about
what the default value is if a program doesn't ask for keepalives. OS X
has a sysctl for this.  NetBSD doesn't; it's up to the program, as it
was historically.
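
For reference, a minimal Python sketch of a program asking for keepalive
on one socket, which is what that HOWTO section shows in C.
SO_KEEPALIVE is the traditional, portable part.  The TCP_KEEPIDLE,
TCP_KEEPINTVL and TCP_KEEPCNT timer options are platform-specific (the
HOWTO covers Linux), so the sketch applies them only where the Python
build exposes them.  The helper name and the timer values are made up
for illustration.

```python
import socket


def enable_keepalive(sock, idle=None, interval=None, count=None):
    """Turn on TCP keepalive for one socket; tune timers where the OS allows."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket timer options are not available everywhere, so each
    # one is applied only if this platform's socket module defines it.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)
        if value is not None and opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)


s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(s, idle=14400, interval=75, count=8)
keepalive_on = s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
s.close()
```

Where the timer options are missing, the socket still gets keepalive,
just with whatever timers the system uses by default.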

The system in question is surely behind a buggy firewall.
However, that's beyond my control.  It's interesting that this doesn't
show up, because I would expect the mobile to lose the data connection
and fail to close the TCP connection fairly often.

Arguably I have a system-wide problem, not a jabber problem.  But
still, given that clients just vanish, it seems like there should be
some mechanism for connections to get cleaned up.


I will check out the application-level keepalive.  What I think I want
is, for a connection from a client, to send a space if there has been no
traffic in or out for 24h.  That will break the stale connections
after a day, and it should not cause any additional traffic on real
connections.  For now I can just send a keepalive every 24h, and that's
close enough.
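
Roughly what I have in mind, as a Python sketch rather than anything
jabberd2 actually does.  The Conn class, its field names and the sweep()
method are hypothetical; the only idea taken from the discussion is
"write one space on a connection that has been silent for 24h".

```python
import socket
import time

IDLE_BEFORE_KEEPALIVE = 86400  # 24h of silence before sending a space


class Conn:
    """Hypothetical wrapper that remembers when traffic was last seen."""

    def __init__(self, sock, now=None):
        self.sock = sock
        self.last_traffic = time.time() if now is None else now

    def sweep(self, now):
        """Send one space if idle too long; return False if the write fails."""
        if now - self.last_traffic < IDLE_BEFORE_KEEPALIVE:
            return True           # recent traffic: send nothing
        try:
            self.sock.sendall(b" ")
        except OSError:
            return False          # write failed: treat the connection as stale
        self.last_traffic = now
        return True


a, b = socket.socketpair()
conn = Conn(a, now=0)
sent_early = conn.sweep(now=3600)      # under 24h: nothing is sent
sent_late = conn.sweep(now=86400)      # idle for 24h: one space goes out
received = b.recv(16)
a.close()
b.close()
```

Note that on a real stale TCP connection the first write usually
succeeds locally; the failure shows up on a later read or write, once
TCP gives up retransmitting or an RST comes back.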




Re: stale connections, keepalive?

2016-08-29 Thread Tomasz Sterna
On Sun, 28.08.2016 at 22:45 +0200, Christof Meerwald wrote:

> > I'm not sure [...]
> 
> Are you sure? [...]

😄



-- 
 /o__ 
(_<^' One good turn deserves another.