Re: [android-developers] Re: Suspicious TCP RST packets while device is sleeping.
I've been seeing the same behavior myself. Can someone address this please?

Thanks
Guy

--
You received this message because you are subscribed to the Google Groups Android Developers group.
To post to this group, send email to android-developers@googlegroups.com
To unsubscribe from this group, send email to android-developers+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/android-developers?hl=en
[android-developers] Re: Suspicious TCP RST packets while device is sleeping.
Well, I don't grok NAT enough to conclude that it's wrong. But I don't see why they'd do it -- unless they're trying to minimize traffic. Seems kinda trivial -- and likely more than offset by the later attempted transmit. I'm not sure what problem you're trying to solve. It can certainly happen that one side thinks a connection is open while the other thinks it's closed. The recipient sends a RST, the sender gets a connection reset and life goes on. Is it the delay in discovering the disconnect that's the issue?
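For illustration, here is a minimal Java sketch of the "connection reset and life goes on" pattern: send a ping, wait for an application-level ack, and reopen the connection once if anything goes wrong. The class name `ReconnectingPinger`, the one-byte `'p'`/`'a'` protocol, and the local `startOneShotAckServer` helper are all hypothetical stand-ins, not anything from the thread. The sketch waits for the ack rather than trusting the write, on the assumption that a write to a half-open connection can appear to succeed locally.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical client: one ping byte out, one ack byte back, reconnect on any failure. */
class ReconnectingPinger {
    private final String host;
    private final int port;
    private final int ackTimeoutMillis;
    private Socket socket;          // null until first use, or after a failure
    private int connectionsOpened;

    ReconnectingPinger(String host, int port, int ackTimeoutMillis) {
        this.host = host;
        this.port = port;
        this.ackTimeoutMillis = ackTimeoutMillis;
    }

    /** Returns true once a ping is acked; retries once on a fresh connection. */
    synchronized boolean ping() {
        for (int attempt = 0; attempt < 2; attempt++) {
            try {
                if (socket == null) {
                    socket = new Socket(host, port);
                    connectionsOpened++;
                    socket.setSoTimeout(ackTimeoutMillis);
                }
                socket.getOutputStream().write('p');
                socket.getOutputStream().flush();
                if (socket.getInputStream().read() == 'a') {
                    return true;
                }
                drop(); // read() gave -1 (peer closed) or an unexpected byte
            } catch (IOException e) {
                drop(); // reset, broken pipe, ack timeout, or connect failure
            }
        }
        return false;
    }

    synchronized int connectionsOpened() {
        return connectionsOpened;
    }

    private void drop() {
        if (socket != null) {
            try { socket.close(); } catch (IOException ignored) { }
            socket = null;
        }
    }

    /** Test stand-in: acks one ping per connection, then closes it. Returns the port. */
    static int startOneShotAckServer() {
        try {
            ServerSocket server = new ServerSocket(0);
            Thread acceptor = new Thread(() -> {
                while (true) {
                    try (Socket s = server.accept()) {
                        if (s.getInputStream().read() == 'p') {
                            s.getOutputStream().write('a');
                            s.getOutputStream().flush();
                        }
                    } catch (IOException e) {
                        return;
                    }
                }
            });
            acceptor.setDaemon(true);
            acceptor.start();
            return server.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Against the one-shot server, the second `ping()` finds its old connection dead, reopens, and still returns true. This only shows the shape of the idea; real code would need backoff and a limit on retries.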
Re: [android-developers] Re: Suspicious TCP RST packets while device is sleeping.
> Is it the delay in discovering the disconnect that's the issue?

Exactly... The connection stays open to accept data from the server. There are definitely stretches when no data arrives for a few minutes. If the connection dropped during one of those, that wouldn't be a problem as long as the client noticed the disconnect immediately (it would just reconnect and start waiting again). However, when the device sleeps, it doesn't see the disconnect until it wakes up (possibly hours later)...

After a bit more research, it looks like if the client holds a wake lock indefinitely (just for testing), it gets the reset packet as soon as the connection is killed, and reconnects immediately. However, if the device doesn't hold the lock and goes to sleep, the reset packet is dropped somewhere...

Is anyone on the dev team able to explain that behavior (intended/unintended/workaround)?

- Dan
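One way to limit the "hours later" window described above is to treat the connection as suspect whenever the device wakes after a long quiet period, and to verify it right away instead of waiting for the next scheduled ping. The sketch below is plain Java with a hypothetical `IdleGuard` class. The five-minute threshold in the usage note is only the figure observed in this thread, and the caller is assumed to supply timestamps from a clock that keeps counting while the device sleeps (on Android, `SystemClock.elapsedRealtime()` is documented to do so).

```java
/** Hypothetical helper: decides when a quiet connection should be re-verified. */
class IdleGuard {
    private final long suspectAfterMillis;
    private long lastTrafficMillis;

    IdleGuard(long suspectAfterMillis, long nowMillis) {
        if (suspectAfterMillis <= 0) {
            throw new IllegalArgumentException("suspectAfterMillis must be positive");
        }
        this.suspectAfterMillis = suspectAfterMillis;
        this.lastTrafficMillis = nowMillis;
    }

    /** Call whenever any byte is sent or received on the connection. */
    synchronized void onTraffic(long nowMillis) {
        lastTrafficMillis = nowMillis;
    }

    /** True if the connection has been quiet long enough that it may have been dropped. */
    synchronized boolean shouldVerify(long nowMillis) {
        return nowMillis - lastTrafficMillis >= suspectAfterMillis;
    }
}
```

On wake, something like `if (guard.shouldVerify(now)) { /* ping, or just reconnect */ }` would run before any other network work. With a threshold of `5 * 60_000L`, a device that slept for two hours would verify immediately.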
[android-developers] Re: Suspicious TCP RST packets while device is sleeping.
This is expected behavior. TCP connections time out if the connection is lost, or either side dies. That way, you don't have systems drowning in dead connections. The RST packet is telling you that the server has forgotten about the connection. The client may even report it directly, if it realizes that it hasn't heard from the server, so you may get a connection reset error even without seeing an actual RST from the server. The default timeout is usually 5 minutes, which squares with your observations.

In general, you should not try to solve your problem by increasing the timeout, but rather by reestablishing the connection, and maintaining long-lived sessions at a higher level. I'd recommend, if possible, dropping your AlarmManager ping task in favor of reopening your connection. You'll consume fewer resources -- including battery.

If you want to minimize the cost of reopening connections, you can send a ping whenever you happen to wake up, reopening if necessary. But that doesn't scale that well -- you'll be able to have more simultaneous clients if you strike a suitable balance between keeping connections alive and the cost of reopening them. For rare interactions, you can support more clients if you open connections on actual need, and close them promptly when not needed. It all depends on exactly what you're trying to optimize, and the environment in which you're operating.

The only constant is -- you can't DEPEND on keeping connections alive. View it as an optimization, rather than how your application works. And then make sure it is actually an optimization! So often, optimizations are a waste of a developer's time.

I'd also recommend avoiding thinking about TCP at the level of packets (or segments), RST, etc., if at all possible. Unless you're trying to diagnose a flaky router, or issues with radio connectivity, or things at a similar level, it's better to focus at a higher level, at least at the socket level -- is it opening, established, closed, reset?
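As a rough illustration of working at the socket level rather than the packet level, the Java sketch below waits briefly on a socket and classifies what it reports. The `SocketProbe` class, its `State` names, and the `demo` helper are hypothetical. Note that `QUIET` only means nothing arrived within the timeout; it does not prove the far side still considers the connection open.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Hypothetical helper: classifies a connection by what a short read reports. */
class SocketProbe {
    enum State { QUIET, DATA, CLOSED, RESET }

    /**
     * Waits up to timeoutMillis (must be > 0) for one byte.
     * The byte is consumed; real code would pass it on to its protocol parser.
     */
    static State probe(Socket socket, int timeoutMillis) {
        try {
            socket.setSoTimeout(timeoutMillis);
            int b = socket.getInputStream().read();
            return b == -1 ? State.CLOSED : State.DATA;
        } catch (SocketTimeoutException e) {
            return State.QUIET;  // no news; the connection may or may not be alive
        } catch (IOException e) {
            return State.RESET;  // "Connection reset"; other I/O errors are lumped in here
        }
    }

    /** Loopback demo: connect to a local listener, optionally close the accepted side, then probe. */
    static State demo(boolean peerCloses) {
        try (ServerSocket server = new ServerSocket(0);
             Socket client = new Socket("127.0.0.1", server.getLocalPort());
             Socket accepted = server.accept()) {
            if (peerCloses) {
                accepted.close();
            }
            return probe(client, 500);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

On loopback, an orderly close by the peer shows up as `CLOSED`, and a silent but open peer shows up as `QUIET`.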
On Feb 2, 1:05 am, Dan Sherman impact...@gmail.com wrote:

Hey guys, trying to track down a rather elusive problem here...

I've been playing around with long-standing TCP connections to a server. The client opens a TCP connection to the server, sets a timeout at a reasonably long period (30 minutes), and adds an AlarmManager task to ping the server every 15 minutes (a ping is just a junk packet the server responds to with an application-level ack).

Nothing fancy, and everything works correctly on the emulator. The client stays connected to the server for as long as I've left it alone (a few hours easily).

However, as soon as it runs on a device, I see some interesting behavior when the device is sleeping (CPU completely off, if I understand correctly).

If I let the device connect and go to sleep (I can't be 100% certain it is asleep, but I wait a good few minutes), and then have the server send an unexpected packet to the client, the client most definitely wakes up, processes the packet, and sends a response. The wakeup noticeably takes a few extra seconds, but this isn't an issue.

The issue comes in if I let the device sleep for a more extended period of time (somewhere around 5 minutes). At that point, I see the server drop the connection as reset, while the client sits there sleeping. As soon as the device is woken up (by my intervention) and I try to do any network actions, it notices the connection isn't good anymore and starts a reconnect (hard-coded to reconnect).

I've been running tcpdump on both the client and the server. The interaction is as follows:

Server's point of view:
- Client connects (a few packets back and forth, application level, etc)
- 5ish minutes pass (device is sleeping)
- Client sends a reset packet (connection is torn down, expected)

From the client's point of view:
- Connection startup (a few packets back and forth, application level, etc)
- Device goes to sleep

The client never sees the TCP reset packet. Once woken by something external (me, the AlarmManager task, etc), the client immediately sees a RST packet from the server, tears down the connection, and starts over.

Anyone care to chime in with ideas as to what is happening? My only thought is that someone in between is killing the connection because it hasn't seen any data sent between the two for a certain amount of time; however, the time between the last packet and the RST isn't a consistent period...

This behavior is happening when running a G1 on T-Mobile's 3G US network. It happens when the server code is running remotely (machine in Texas), as well as when it's running on a local machine (Florida).
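The periodic ping in the setup above can be sketched in plain Java as follows. This is a stand-in only: `HeartbeatTimer` is a hypothetical name, and `ScheduledExecutorService` is used in place of AlarmManager so the example runs anywhere. An ordinary Java timer like this should not be expected to fire while the CPU is asleep, which is presumably why the original uses AlarmManager.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: runs a ping task at a fixed interval on a daemon thread. */
class HeartbeatTimer {
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "heartbeat");
                t.setDaemon(true);
                return t;
            });
    private final AtomicInteger failures = new AtomicInteger();

    /** Schedules ping to run every interval; the first run happens after one interval. */
    void start(Runnable ping, long interval, TimeUnit unit) {
        timer.scheduleAtFixedRate(() -> {
            try {
                ping.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all later runs, so count it instead.
                failures.incrementAndGet();
            }
        }, interval, interval, unit);
    }

    int failures() {
        return failures.get();
    }

    void stop() {
        timer.shutdownNow();
    }
}
```

For the setup described in the message, the call would look like `timer.start(pingTask, 15, TimeUnit.MINUTES)`, where `pingTask` sends the junk packet and waits for the application-level ack.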
Re: [android-developers] Re: Suspicious TCP RST packets while device is sleeping.
Hey Bob,

Thanks a lot for the response :)

After a few more hours tonight working on the problem, I've got a bit more information to present.

From everything I'm seeing, it looks like the issue has to do with NAT'ing at the network level (T-Mobile, I'd imagine). The connection is definitely NAT'd: the client sees itself as one outgoing IP (14.130.xxx.xxx) and port, and the server sees an incoming connection from a different IP/port (208.54.xxx.xxx).

My best guess is that T-Mobile is killing connections at the NAT level after not seeing any traffic on them for a certain period of time (5 minutes in this case). This wouldn't be a problem; as you said, a reconnect works just fine. In fact, the higher-level long-lived session control is already in place, and the client reconnects properly when it senses a disconnect.

The problem comes from _how_ the NAT is killing the connection. Keeping a wake lock on the device to prevent sleeping, and watching tcpdump on both sides, shows the server receiving a RST packet, but no RST packet is sent to the client. The client sits there indefinitely, assuming the connection is still active. The second it tries to do something (user-prompted, or via a ping timer), it sends a PSH packet to the server, and the server responds with a RST (it closed the connection when it got the RST from the NAT).

Obviously, if the NAT sent RSTs in both directions, this wouldn't be a problem: the client would notice the disconnect and reconnect. But from everything I can tell, it notifies the server and leaves the client completely unaware that the connection has been dropped...

I understand that the NAT needs to clear out old/stale connections, but sending a RST in only one direction seems a bit incorrect to me...

Any ideas?

- Dan
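If the NAT idle-timeout guess is right, a ping can only keep the mapping alive when it fires more often than the timeout, which the original 15-minute ping would not do against a roughly 5-minute limit. The arithmetic is sketched below in Java. `KeepAlivePlan` and the safety factor are hypothetical, the 5-minute figure is only what was observed in this thread, and the message itself notes that the timing was not consistent.

```java
/** Hypothetical helper: compares a ping interval against a suspected NAT idle timeout. */
class KeepAlivePlan {

    /** True if pings arrive before the suspected idle timeout would expire. */
    static boolean keepsMappingAlive(long pingIntervalMillis, long idleTimeoutMillis) {
        return pingIntervalMillis < idleTimeoutMillis;
    }

    /**
     * Suggests a ping interval as a fraction of the suspected idle timeout.
     * safetyFactor must be strictly between 0 and 1, for example 0.8.
     */
    static long suggestedIntervalMillis(long idleTimeoutMillis, double safetyFactor) {
        if (idleTimeoutMillis <= 0 || safetyFactor <= 0.0 || safetyFactor >= 1.0) {
            throw new IllegalArgumentException("need a positive timeout and 0 < safetyFactor < 1");
        }
        return (long) (idleTimeoutMillis * safetyFactor);
    }
}
```

More frequent pings cost battery and radio time, so this is a trade-off against simply reconnecting on demand, not a recommendation.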