We've developed a fairly sophisticated Android tablet application which 
relies on tablets communicating peer-to-peer over wifi. We're using 
straightforward Java TCP/IP sockets to do this, with our own 
application-level handshaking and protocols, including reconnection when 
connections break. All this is working well on even the flakiest wifi, so 
we have the product running at beta sites now.

However, there is one problem that is eluding us so far. Under some 
circumstances (unknown, but the episodes tend to cluster in time) socket 
read or write times suddenly jump to the 10 to 150 second range when they 
are normally around 100 ms. This can go on for many minutes and then goes 
away equally mysteriously. While it is going on we also see a corresponding 
increase in round-trip time for UDP packets (pings) addressed to the 
relevant tablets, as if the Android wifi has somehow gone to sleep. The app 
is still in the foreground and is running normally except for the slow 
socket reads and writes, which we measure by timing either side of the 
socket calls (the calling thread blocks, of course).
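
To make the measurement concrete, here is a minimal sketch of timing either side of the blocking calls. It is illustrative only, not our production code: the class name TimedSocketIo and its Listener callback are hypothetical, and it assumes the streams come from an already connected java.net.Socket.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper: wraps blocking stream calls and reports how long each one took. */
final class TimedSocketIo {
    /** Receives one measurement per call; a real app would log or aggregate these. */
    interface Listener {
        void onTiming(String op, int bytes, long elapsedMillis);
    }

    private final InputStream in;
    private final OutputStream out;
    private final Listener listener;

    TimedSocketIo(InputStream in, OutputStream out, Listener listener) {
        this.in = in;
        this.out = out;
        this.listener = listener;
    }

    /** Writes and flushes the buffer, timing the blocking call with a monotonic clock. */
    void write(byte[] data) throws IOException {
        long start = System.nanoTime();
        out.write(data);
        out.flush();
        listener.onTiming("write", data.length, elapsedMillis(start));
    }

    /** Performs one blocking read; returns the byte count, or -1 at end of stream. */
    int read(byte[] buf) throws IOException {
        long start = System.nanoTime();
        int n = in.read(buf);
        listener.onTiming("read", n, elapsedMillis(start));
        return n;
    }

    private static long elapsedMillis(long startNanos) {
        return (System.nanoTime() - startNanos) / 1_000_000L;
    }
}
```

System.nanoTime() is used rather than wall-clock time so that clock adjustments on the device cannot distort the figures. Bear in mind that a blocking write normally returns as soon as the data has been copied into the socket send buffer, so a slow write usually means that buffer is already full.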

We've eliminated threading/queue problems as the cause - we now have a 
basic two-thread test program which just sends slightly lumpy generated 
traffic from one device to another and echoes it back. We're not sending 
large amounts of traffic - typically a burst of a few KB every few minutes, 
and sometimes nothing for up to half an hour. Occasionally there is a file 
download, but mostly at startup. The slowness is unrelated to whether the 
connection has been recently used (apart from the expected 'wakeup time' 
if the connection or device has been dormant for a while). 
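
A stripped-down version of that kind of echo test might look like the sketch below. This is not the actual test program: EchoProbe, the length-prefixed framing and all parameter names are assumptions made for illustration. One thread echoes frames back, the other sends randomly sized frames with random gaps and records each round-trip time.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Arrays;
import java.util.Random;

/** Hypothetical two-thread echo probe for measuring round-trip times over TCP. */
final class EchoProbe {
    /** Echo side: accepts one connection and sends every length-prefixed frame straight back. */
    static Thread startEchoServer(final ServerSocket server) {
        Thread t = new Thread(() -> {
            try (Socket s = server.accept();
                 DataInputStream in = new DataInputStream(s.getInputStream());
                 DataOutputStream out = new DataOutputStream(s.getOutputStream())) {
                while (true) {
                    int len = in.readInt();      // throws EOFException once the peer closes
                    byte[] frame = new byte[len];
                    in.readFully(frame);
                    out.writeInt(len);
                    out.write(frame);
                    out.flush();
                }
            } catch (IOException e) {
                // Peer closed or the connection broke; this sketch simply ends the thread.
            }
        }, "echo-server");
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Sending side: sends `bursts` random frames and returns each round-trip time in ms. */
    static long[] runClient(String host, int port, int bursts, int maxFrameBytes,
                            long maxGapMillis) throws IOException, InterruptedException {
        Random rnd = new Random();
        long[] rtts = new long[bursts];
        try (Socket s = new Socket(host, port);
             DataOutputStream out = new DataOutputStream(s.getOutputStream());
             DataInputStream in = new DataInputStream(s.getInputStream())) {
            s.setTcpNoDelay(true);               // avoid Nagle delays on small frames
            for (int i = 0; i < bursts; i++) {
                byte[] frame = new byte[1 + rnd.nextInt(maxFrameBytes)];
                rnd.nextBytes(frame);
                long start = System.nanoTime();
                out.writeInt(frame.length);
                out.write(frame);
                out.flush();
                byte[] echoed = new byte[in.readInt()];
                in.readFully(echoed);
                rtts[i] = (System.nanoTime() - start) / 1_000_000L;
                if (!Arrays.equals(frame, echoed)) {
                    throw new IOException("echo mismatch in burst " + i);
                }
                if (maxGapMillis > 0) {
                    Thread.sleep((long) (rnd.nextDouble() * maxGapMillis)); // lumpy gap
                }
            }
        }
        return rtts;
    }
}
```

With the echo side running on one device and runClient on another, any round-trip time far above the usual 100 ms or so marks the start of a slow episode, independent of the application's own threading and queues.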

Occasional slowness won't hurt us, but unfortunately once it starts it 
tends to be sticky for a time. Then it goes away without any intervention 
on our part. It only happens on some sites, and where it does manifest it 
may happen anywhere from once to a few times a day, or not at all for weeks.

One theory that we have is that Android doesn't like long-lived open socket 
connections between devices. It would make sense that it might be optimised 
for short-lived connections like those HTTP uses. We are trying a strategy 
where we close the connection if it is quiet and reopen it on demand, to see 
whether that makes a difference. It's a long test cycle, though, as we have 
to have a setup running for several days before we can realistically assess 
whether a change has brought any improvement. We have many other things on 
our list to try, but progress is slow because of the long cycle.
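
The close-when-quiet, reopen-on-demand idea can be sketched as below. This is only an illustration under stated assumptions: IdleClosingConnection, its timeouts and its method names are hypothetical, and it presumes something else (a timer, say) calls closeIfIdle() periodically. It is not a claim about how Android treats long-lived sockets.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical wrapper: drops the TCP connection after a quiet period, reconnects on send. */
final class IdleClosingConnection {
    private final String host;
    private final int port;
    private final long idleTimeoutMillis;
    private final int connectTimeoutMillis;
    private Socket socket;          // null while disconnected
    private long lastUsedNanos;
    private int connectCount;

    IdleClosingConnection(String host, int port, long idleTimeoutMillis,
                          int connectTimeoutMillis) {
        this.host = host;
        this.port = port;
        this.idleTimeoutMillis = idleTimeoutMillis;
        this.connectTimeoutMillis = connectTimeoutMillis;
    }

    /** Sends data, opening a fresh connection first if the old one was closed. */
    synchronized void send(byte[] data) throws IOException {
        if (socket == null || socket.isClosed()) {
            connect();
        }
        try {
            OutputStream out = socket.getOutputStream();
            out.write(data);
            out.flush();
            lastUsedNanos = System.nanoTime();
        } catch (IOException e) {
            closeQuietly();         // force a reconnect on the next send
            throw e;
        }
    }

    /** Call periodically; closes the socket if it has been quiet for too long. */
    synchronized boolean closeIfIdle() {
        if (socket == null || socket.isClosed()) {
            return false;
        }
        long idleMillis = (System.nanoTime() - lastUsedNanos) / 1_000_000L;
        if (idleMillis < idleTimeoutMillis) {
            return false;
        }
        closeQuietly();
        return true;
    }

    synchronized boolean isOpen() {
        return socket != null && !socket.isClosed();
    }

    /** Number of connections opened so far, useful when comparing test runs. */
    synchronized int connectCount() {
        return connectCount;
    }

    private void connect() throws IOException {
        Socket s = new Socket();
        try {
            s.connect(new InetSocketAddress(host, port), connectTimeoutMillis);
        } catch (IOException e) {
            s.close();
            throw e;
        }
        socket = s;
        connectCount++;
    }

    private void closeQuietly() {
        try {
            if (socket != null) {
                socket.close();
            }
        } catch (IOException ignored) {
            // Nothing useful to do if close itself fails.
        }
        socket = null;
    }
}
```

The trade-off is that the first send after a quiet period pays for a fresh TCP handshake on top of any wakeup time, so the idle timeout needs to be chosen with the few-minute traffic pattern in mind.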

We are using a mix of Android 5 and 6 devices (mostly 6 now - whatever they 
upgrade to in the normal course), and a wide range of wifi routers 
(Netgear, Ubiquiti, OpenMesh, Technicolor to name a few). All exhibit the 
same problem, while we can have Windows machines merrily talking to each 
other over the very same wifi network at the same time without seeing any 
problems at all. The routers differ mostly in how often they just drop 
connections (whereupon we simply reconnect and carry on). We're mostly 
using Samsung kit as we've found that to be robust, performant and reliable.

Wireshark shows an increased number of retransmitted packets when we have 
the two halves of the test program running on Android and Windows devices 
and the slowness manifests. It usually afflicts multiple Android devices at 
about the same time, i.e. if one is seeing it then often, though not 
always, the others will be too.

Top-level question: has anybody else come across problems similar to this 
with Android networking? We get the impression that not many other people 
are using Android like this. We don't expect TCP/IP over wifi to be 
reliable in any sense, especially out in the real world of cheap and 
cheerful wifi, but this particular problem hurts us because we need 
responsive communications for certain operations and the stickiness of the 
problem makes retrying ineffective.

There are some architectural changes we could make to the design of our 
apps (more distributed, less client/server) but that's a big move just to 
get round a very specific problem.
