Will Pierce created THRIFT-1737:
-----------------------------------
Summary: UDP socket support for python
Key: THRIFT-1737
URL: https://issues.apache.org/jira/browse/THRIFT-1737
Project: Thrift
Issue Type: New Feature
Components: Python - Library
Reporter: Will Pierce
This patch adds support for UDP socket servers and clients in Python. UDP
avoids the overhead and network latency of TCP handshaking, _especially_ for
"oneway" service methods.
One useful feature of a UDP service is that the clients don't need to rebuild
their connection to the server when a UDP packet is lost, so the "blast radius"
of the timeout exception is limited to a single service call, not the entire
"connection". Also, framing is not necessary because UDP packets have length
encoded in their header.
This transport is not suitable for large messages because UDP is inherently
limited to 64 KB packet lengths, and often much smaller (500 - 1500 bytes)
depending on intermediate links and whether UDP fragments are reassembled.
Avoid large query/response payloads with this transport.
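To make the size limits concrete, here is a minimal sketch of a guard that a caller could run on a serialized message before sending it. It is not part of the patch: check_payload and the constants are hypothetical names, and the numbers assume IPv4 without IP options and a plain 1500-byte Ethernet MTU.

```python
# Hypothetical helper, not part of the patch.
IPV4_HEADER = 20      # bytes, assuming no IP options
UDP_HEADER = 8        # bytes
ETHERNET_MTU = 1500   # bytes, assumed link MTU

# Hard ceiling for one UDP datagram over IPv4.
MAX_UDP_PAYLOAD = 65535 - IPV4_HEADER - UDP_HEADER
# Largest payload that avoids IP fragmentation on the assumed link.
SAFE_PAYLOAD = ETHERNET_MTU - IPV4_HEADER - UDP_HEADER


def check_payload(payload, limit=SAFE_PAYLOAD):
    """Return payload unchanged, or raise ValueError if it is too large."""
    if limit > MAX_UDP_PAYLOAD:
        raise ValueError("limit exceeds the UDP maximum of %d bytes" % MAX_UDP_PAYLOAD)
    if len(payload) > limit:
        raise ValueError("payload is %d bytes, limit is %d" % (len(payload), limit))
    return payload
```

A smaller limit (closer to 500 bytes) may be the safer choice when the path between client and server is unknown.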
h2. Implementation
UDP support is implemented by subclassing TSocket and TServerSocket into
TUDPSocket and TServerUDPSocket, and adding a TDatagramTransport. The server's
accept() method actually receives an entire inbound request packet. An inbound
packet is wrapped as a stream with StringIO, and the response "connection"
records the sender's host+port so responses are delivered from the server's
socket back to the client.
The TDatagramTransport converts the EOFError raised after reaching the end of
the packet into a TTransport exception, to accommodate TServers.
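The general shape of this design can be sketched with only the standard library. This is an illustration, not the code in the patch: DatagramRequest and accept_datagram are hypothetical names, it is written for Python 3 (so io.BytesIO stands in for StringIO), and it raises EOFError directly at the end of the packet where the patch converts that error into a TTransport exception.

```python
import io
import socket


class DatagramRequest(object):
    """One inbound packet exposed as a readable stream plus a reply channel."""

    def __init__(self, sock, data, peer):
        self._sock = sock
        self._in = io.BytesIO(data)  # the whole request, readable like a stream
        self._out = io.BytesIO()     # buffers the response until flush()
        self.peer = peer             # sender's address, used for the reply

    def read(self, size):
        chunk = self._in.read(size)
        if size > 0 and not chunk:
            # End of packet: there is nothing more to wait for on this "connection".
            raise EOFError("end of datagram")
        return chunk

    def write(self, data):
        self._out.write(data)

    def flush(self):
        # Send the buffered response as one datagram from the server's socket.
        self._sock.sendto(self._out.getvalue(), self.peer)
        self._out = io.BytesIO()


def accept_datagram(sock, max_size=65535):
    """Block until one packet arrives and wrap it as a DatagramRequest."""
    data, peer = sock.recvfrom(max_size)
    return DatagramRequest(sock, data, peer)
```

Because every "connection" holds exactly one request, a server loop that calls accept_datagram repeatedly sees each client call as a short-lived connection that ends as soon as the packet is consumed.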
h2. Testing:
The unit tests now have a TestUDP.py script which runs a UDP server and client,
and exercises several of the ThriftTest service calls, and verifies that
responses match expectations. It ensures that "oneway" method calls are truly
non-blocking, 1 packet "send and forget". It also forces a timeout in the
middle of a sequence of blocking RPC calls, which confirms that a timeout only
breaks a single RPC, not the entire client.
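The timeout behaviour can be reproduced outside Thrift with plain sockets. The sketch below is not TestUDP.py. It is a standalone Python 3 illustration with hypothetical names (lossy_echo_server, run_demo), in which a local server deliberately ignores one request so that only the middle call times out.

```python
import socket
import threading


def lossy_echo_server(sock):
    """Echo each datagram back, ignore b"drop", and exit on b"stop"."""
    while True:
        data, peer = sock.recvfrom(2048)
        if data == b"stop":
            return
        if data != b"drop":
            sock.sendto(data, peer)


def run_demo(timeout=0.5):
    """Make three calls on one client socket; the second one is dropped."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    addr = server.getsockname()
    worker = threading.Thread(target=lossy_echo_server, args=(server,))
    worker.daemon = True
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(timeout)
    results = []
    try:
        for payload in (b"first", b"drop", b"third"):
            client.sendto(payload, addr)
            try:
                reply, _ = client.recvfrom(2048)
                results.append(reply)
            except socket.timeout:
                # Only this call failed; the same socket is reused below.
                results.append(None)
    finally:
        client.sendto(b"stop", addr)
        worker.join(2)
        client.close()
        server.close()
    return results
```

The third call succeeds on the same client socket without any reconnect step, which is the property the test in the patch is checking for.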
I haven't used this with server types other than TThreadedServer, or in a big
environment yet. There may be edge-cases that haven't surfaced yet.
Tested with IPv4 and IPv6 on localhost with Python 2.7 (dev box is Fedora 17).
h2. Minor bugfix:
The Python RunClientServer.py test script had a 1-line bug where it ran some
other test scripts twice by mistake (probably a cut-and-paste error).
h2. General warnings for posterity:
* UDP packets are *easily spoofed*!
** don't use this on public-internet facing interfaces
** spoofed client IP attacks may turn your server into an attack vector
* UDP is not reliable
** clients will have to handle socket.timeout exceptions for every RPC call
** UDP may be _more_ unreliable during network congestion
* No retries.
** this library doesn't do any retries
** there's only one timeout setting per client, which applies to every method call
** but the timeout may be changed with the existing .setTimeout(msec) call
* Compression
** I haven't tested wrapping this transport in TZlibTransport to compress the packets, but it ought to work (unless there are bugs)
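Since the library does no retries, callers that want them have to wrap each call themselves. The following is a minimal sketch of such a wrapper, assuming the call raises socket.timeout on a lost packet as described above. call_with_retries is a hypothetical helper, not something the patch provides.

```python
import socket


def call_with_retries(rpc, attempts=3):
    """Call rpc() up to `attempts` times, retrying only on socket.timeout."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(attempts):
        try:
            return rpc()
        except socket.timeout:
            # Re-raise once the last allowed attempt has also timed out.
            if attempt == attempts - 1:
                raise
```

With a generated client this might be used as call_with_retries(lambda: client.testString("hello")), where client is assumed to be a ThriftTest client built on the UDP transport. Retrying is only safe for idempotent methods: a timeout can also mean the request was executed and only the response was lost.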
h2. Tuning to avoid Timeouts:
Linux hosts tend to have small default values for the kernel's memory buffers
used to queue up UDP packets. When that buffer fills up with packets that the
server process hasn't yet read, the kernel drops newly arriving packets, even
though they have been fully decoded and pulled off the NIC.
This shows up as lots of "socket.timeout" exceptions raised in client
code, with no sign of an inbound method call at the server.
If you run "netstat -s" and see increasing "packet receive errors" in the *Udp*
section of output, that is strong evidence that you need to increase your
hosts' receive buffers.
As root, you can raise the default socket receive (and send) buffer size, which UDP sockets use, to 4 MB with:
{noformat}
sysctl -w net.core.rmem_default=4194304
sysctl -w net.core.wmem_default=4194304
{noformat}
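A per-socket alternative that needs no root access is to request a larger buffer with the standard SO_RCVBUF socket option. The sketch below is not part of the patch, and request_receive_buffer is a hypothetical helper. On Linux the kernel caps the request at net.core.rmem_max and reports roughly double the granted value to account for bookkeeping overhead, so the helper returns what was actually granted.

```python
import socket


def request_receive_buffer(sock, size):
    """Ask the kernel for a receive buffer of `size` bytes; return what it reports."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
```

Comparing the returned value with the requested size shows whether the host's rmem_max limit is getting in the way.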