No, we only use HTTP Transport. For anything on the public Internet, this is the only way to go ... it also gives you lots of extra advantages like client firewall support, hardware load balancing, SSL "for free", etc. When we were adopting Thrift three years ago, I did some synthetic load tests to compare the overhead of THttpClient transport versus direct binary transport. When the HTTP stack supported proper HTTP Keep-Alive, the overhead was negligible (under 20%). Unfortunately, several languages don't do proper keep-alive in their HTTP libraries by default, so your mileage may vary drastically.

We mitigate Thrift-related denial-of-service attacks through a mix of measures that should (hopefully) make a Thrift protocol attack less fruitful than other attacks (i.e., so that Thrift isn't the weakest link). For example, we use TProtocolUtil.setMaxSkipDepth() to cap how deeply the parser will recurse when skipping bogus sequences of nested structures:
http://svn.apache.org/viewvc/incubator/thrift/trunk/lib/java/src/org/apache/thrift/protocol/TProtocolUtil.java?revision=760189&view=co
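As a rough illustration of the depth-limit idea (this is not Thrift's code or wire format), here is a minimal sketch of a recursive skip that refuses to descend past a fixed depth. The type tags, the class name SkipDepthSketch, and the nested() helper are all hypothetical and exist only for this example:

```java
import java.nio.ByteBuffer;

/** Toy illustration only: a made-up format, NOT Thrift's wire format or API. */
final class SkipDepthSketch {
    // Hypothetical type tags for the toy format.
    static final byte STOP = 0;    // ends the field list of a struct
    static final byte I32 = 1;     // followed by a 4-byte integer
    static final byte STRUCT = 2;  // followed by fields until STOP

    /** Skips one value of the given type, refusing to recurse deeper than maxDepth. */
    static void skip(ByteBuffer in, byte type, int maxDepth) {
        if (maxDepth <= 0) {
            throw new IllegalStateException("Maximum skip depth exceeded");
        }
        switch (type) {
            case I32:
                in.getInt(); // consume the 4-byte payload
                break;
            case STRUCT:
                while (true) {
                    byte fieldType = in.get();
                    if (fieldType == STOP) {
                        break; // leaves the while loop
                    }
                    // Each nested value spends one level of the depth budget.
                    skip(in, fieldType, maxDepth - 1);
                }
                break;
            default:
                throw new IllegalStateException("Unknown type tag: " + type);
        }
    }

    /** Builds a message of `levels` structs nested inside each other. */
    static ByteBuffer nested(int levels) {
        ByteBuffer buf = ByteBuffer.allocate(2 * levels);
        for (int i = 0; i < levels; i++) {
            buf.put(STRUCT);
        }
        for (int i = 0; i < levels; i++) {
            buf.put(STOP);
        }
        buf.flip();
        return buf;
    }
}
```

With a limit of 8, SkipDepthSketch.nested(3) is skipped normally, while SkipDepthSketch.nested(100) fails with an IllegalStateException after a handful of recursive calls instead of walking all 100 levels.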
We also take the total incoming message length from the HTTP Content-Length header so that we can reject oversized messages before parsing, and we pass that length to TBinaryProtocol.setReadLength() as a limit so that bogus object length/size fields are rejected automatically:
http://svn.apache.org/viewvc/incubator/thrift/trunk/lib/java/src/org/apache/thrift/protocol/TBinaryProtocol.java?view=co
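The two checks can be sketched without any Thrift classes. The following is only an illustration under assumptions: the class ReadBudgetSketch, the 55 MB cap, and the length-prefixed string layout are made up for the example and are not taken from TBinaryProtocol. The idea is to refuse the request outright if the declared Content-Length is too large, then charge every read against that declared length so that a length field claiming more bytes than the message contains is rejected before any buffer is allocated:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Toy illustration only: NOT Thrift's API. All names here are hypothetical. */
final class ReadBudgetSketch {
    // Assumed server-side cap on a single message, checked before parsing.
    static final int MAX_MESSAGE_BYTES = 55 * 1024 * 1024;

    // Bytes the parser may still consume, seeded from Content-Length.
    private int remaining;

    ReadBudgetSketch(long contentLength) {
        if (contentLength < 0 || contentLength > MAX_MESSAGE_BYTES) {
            throw new IllegalArgumentException("Rejected Content-Length: " + contentLength);
        }
        this.remaining = (int) contentLength;
    }

    /** Charges `length` bytes against the budget, or throws if it does not fit. */
    void charge(int length) {
        if (length < 0 || length > remaining) {
            throw new IllegalStateException("Bogus or oversized length: " + length);
        }
        remaining -= length;
    }

    /** Reads a 4-byte length prefix followed by that many UTF-8 bytes. */
    String readString(ByteBuffer in) {
        charge(4);
        int length = in.getInt();
        charge(length); // rejects the claim before allocating the array
        byte[] bytes = new byte[length];
        in.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

For a 9-byte body holding the length 5 and the bytes of "hello", new ReadBudgetSketch(9).readString(...) returns the string. A 4-byte body whose length field claims Integer.MAX_VALUE fails in charge() without allocating anything, and a Content-Length above the cap is refused in the constructor.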

Our use of Thrift is obviously a bit unusual compared to most folks using it for internal server-server communications, but we have millions of distinct client machines talking Thrift to Evernote every month, so I can vouch that it works.


On 6/11/10 9:37 AM, Bjørn Borud wrote:
On Fri, Jun 11, 2010 at 5:32 PM, Dave Engberg<[email protected]>  wrote:

Evernote uses Thrift for all client-server communications, including
third-party API integrations (http://www.evernote.com/about/developer/api/).
  We serialize messages up to 55MB via Thrift.  This is very efficient on the
wire, but marshalling and unmarshalling objects can take a fair amount of
RAM due to various temporary buffers built into the networking and IO
runtime libraries.

do you use TFramedTransport?  if so, I would assume that you have set the
frame size to 55MB to avoid the OOM error problems?  I've been thinking a bit
about this lately since I may want to expose a Thrift API to the outside
world.  Not setting a limit makes it exceptionally susceptible to
denial-of-service (just connect a socket and say "asdf" and boom).  Setting
the limit too high would require about 5 minutes more hacking to create a
program that sucks up lots of resources on the server.

(I guess this problem is also why TFramedTransport avoids using
direct-allocated ByteBuffer?)

One improvement would be to have the ability to do sanity checks on frames
over a certain size -- so that connections writing bogus data can be killed
off early.  But it isn't a quick fix and I am not entirely convinced that it
is worthwhile either.

-Bjørn
