No, we only use HTTP Transport. For anything on the public Internet,
this is the only way to go ... it also gives you lots of extra
advantages like client firewall support, hardware load balancing, SSL
"for free", etc. When we were adopting Thrift three years ago, I did
some synthetic load tests to compare the overhead of THttpClient
transport versus direct binary transport. If the HTTP stack supported
proper HTTP Keep-Alive, the overhead was negligible (under 20%).
Unfortunately, several languages don't do proper keep-alive in their
HTTP libraries by default, so your mileage may vary drastically.
We mitigate Thrift-related denial-of-service attacks through a mix of
measures that should (hopefully) make a Thrift protocol attack less
fruitful than other attacks. (I.e. so that Thrift isn't the weakest link.)
For example, we use maxSkipDepth() to avoid bogus sequences of nested
structures:
http://svn.apache.org/viewvc/incubator/thrift/trunk/lib/java/src/org/apache/thrift/protocol/TProtocolUtil.java?revision=760189&view=co
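
A minimal sketch of the depth-limit idea, using only the Java standard
library and a simplified toy wire format rather than Thrift's actual
classes. The class DepthLimitedSkip, its helpers nested() and accepts(),
and the field layout are all hypothetical; only the type tag values are
borrowed from Thrift's TType constants:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.Arrays;

final class DepthLimitedSkip {
    // Tag values match Thrift's TType; the layout below is a toy, not the real wire format.
    static final byte STOP = 0;
    static final byte I32 = 8;
    static final byte STRUCT = 12;

    /** Skips one value of the given type, refusing to recurse deeper than maxDepth. */
    static void skip(DataInputStream in, byte type, int maxDepth) throws IOException {
        if (maxDepth <= 0) {
            throw new IOException("Maximum skip depth exceeded");
        }
        switch (type) {
            case I32:
                in.readInt();
                break;
            case STRUCT:
                // Toy struct: a run of [type tag][value] fields terminated by STOP.
                while (true) {
                    byte fieldType = in.readByte();
                    if (fieldType == STOP) {
                        return;
                    }
                    skip(in, fieldType, maxDepth - 1);
                }
            default:
                throw new IOException("Unknown type tag: " + type);
        }
    }

    /** Builds the body of a struct that contains depth - 1 further nested structs. */
    static byte[] nested(int depth) {
        byte[] body = new byte[2 * depth - 1]; // trailing zero bytes are the STOP tags
        Arrays.fill(body, 0, depth - 1, STRUCT);
        return body;
    }

    /** Returns true if the struct body can be skipped within the depth limit. */
    static boolean accepts(byte[] body, int maxDepth) {
        try {
            skip(new DataInputStream(new ByteArrayInputStream(body)), STRUCT, maxDepth);
            return true;
        } catch (IOException e) {
            return false; // too deep, truncated, or unknown tag
        }
    }
}
```

With a limit of 3, accepts(nested(3), 3) is true and accepts(nested(4), 3)
is false, so a payload that nests one level too deep is dropped after
reading only a few bytes.
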
And we determine the total incoming message length via the HTTP
Content-Length header to reject big messages before parsing, and use
this as a limit to TBinaryProtocol.setReadLength() to automatically
reject bogus object length/size fields:
http://svn.apache.org/viewvc/incubator/thrift/trunk/lib/java/src/org/apache/thrift/protocol/TBinaryProtocol.java?view=co
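
Roughly, the two checks fit together as below. This is a sketch under
assumptions, not Evernote's code and not the Thrift API: ReadBudget,
fromContentLength(), consume(), and the 55MB cap (taken from the message
size mentioned in this thread) are hypothetical names and values, and the
budget only imitates the idea behind setReadLength() using the standard
library:

```java
final class ReadBudget {
    // Hypothetical cap; pick whatever your largest legitimate message is.
    static final long MAX_MESSAGE_BYTES = 55L * 1024 * 1024;

    private long remaining;

    private ReadBudget(long remaining) {
        this.remaining = remaining;
    }

    /** Rejects a missing, malformed, or oversized Content-Length before any parsing. */
    static ReadBudget fromContentLength(String header) {
        long length;
        try {
            length = Long.parseLong(header == null ? "" : header.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Missing or malformed Content-Length");
        }
        if (length < 0 || length > MAX_MESSAGE_BYTES) {
            throw new IllegalArgumentException("Message too large: " + length);
        }
        return new ReadBudget(length);
    }

    /** Call with each declared string/binary/container size before allocating for it. */
    void consume(int declaredSize) {
        if (declaredSize < 0 || declaredSize > remaining) {
            throw new IllegalArgumentException("Declared size exceeds message length");
        }
        remaining -= declaredSize;
    }

    long remaining() {
        return remaining;
    }
}
```

A servlet would call fromContentLength() with the request header before
handing the body to the protocol, then call consume() wherever a length
field is read off the wire, so a bogus size can never claim more bytes
than the request actually carries.
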
Our use of Thrift is obviously a bit unusual compared to most folks
using it for internal server-server communications, but we have millions
of distinct client machines talking Thrift to Evernote every month, so I
can vouch that it works.
On 6/11/10 9:37 AM, Bjørn Borud wrote:
On Fri, Jun 11, 2010 at 5:32 PM, Dave Engberg<[email protected]> wrote:
Evernote uses Thrift for all client-server communications, including
third-party API integrations (http://www.evernote.com/about/developer/api/).
We serialize messages up to 55MB via Thrift. This is very efficient on the
wire, but marshalling and unmarshalling objects can take a fair amount of
RAM due to various temporary buffers built into the networking and IO
runtime libraries.
Do you use TFramedTransport? If so, I would assume that you have set the
frame size to 55MB to avoid the OOM error problems? I've been thinking a bit
about this lately since I may want to expose a Thrift API to the outside
world. Not setting a limit makes it exceptionally susceptible to
denial-of-service (just connect a socket and say "asdf" and boom). Setting
the limit too high would require about 5 minutes more hacking to create a
program that sucks up lots of resources on the server.
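
To put a number on the "asdf" case: a framed transport reads a 4-byte
big-endian length before the payload, so those four ASCII bytes decode to
a frame of roughly 1.6GB. A small sketch of checking the prefix before
allocating anything; FrameGuard and its methods are hypothetical names and
not part of Thrift:

```java
import java.nio.ByteBuffer;

final class FrameGuard {
    /** Decodes the 4-byte big-endian frame length prefix. */
    static int frameLength(byte[] header) {
        // ByteBuffer defaults to big-endian byte order.
        return ByteBuffer.wrap(header, 0, 4).getInt();
    }

    /** Returns true only if the declared frame length is non-negative and within the cap. */
    static boolean acceptable(byte[] header, int maxFrameBytes) {
        int length = frameLength(header);
        return length >= 0 && length <= maxFrameBytes;
    }
}
```

frameLength("asdf".getBytes(StandardCharsets.US_ASCII)) is 1634952294, so
any reasonable cap rejects the connection before a buffer is allocated.
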
(I guess this problem is also why TFramedTransport avoids using
direct-allocated ByteBuffer?)
One improvement would be to have the ability to do sanity checks on frames
over a certain size -- so that connections writing bogus data can be killed
off early. But it isn't a quick fix and I am not entirely convinced that it
is worthwhile either.
-Bjørn