[
https://issues.apache.org/jira/browse/THRIFT-3805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jens Geyer resolved THRIFT-3805.
--------------------------------
Resolution: Fixed
Assignee: Michael Scott Leuthaeuser
Fix Version/s: 0.10.0
Committed.
> Golang server susceptible to memory spike from malformed message
> ----------------------------------------------------------------
>
> Key: THRIFT-3805
> URL: https://issues.apache.org/jira/browse/THRIFT-3805
> Project: Thrift
> Issue Type: Bug
> Components: Go - Library
> Affects Versions: 0.9.3
> Environment: OSX 10.10.5, Debian 7.10
> Reporter: Michael Scott Leuthaeuser
> Assignee: Michael Scott Leuthaeuser
> Priority: Critical
> Fix For: 0.10.0
>
> Attachments: thrift_golang_memory_spike_fix.patch
>
>
> When a thrift server is started through the Go library, using the default
> TServerSocket, the processor from the generated files, and a SimpleServer2
> (which uses BinaryProtocol), it listens on the created socket for incoming
> messages. However, if the incoming message is *not* from another Thrift
> server, and thus the data submitted over the connection does not conform to
> the Thrift message serialization format (for us, this was triggered by
> automated security scanners connecting to the exposed port and sending it
> data to see how it would react), it triggers a massive memory spike.
> This appears to happen because the BinaryProtocol expects the first 4 bytes
> of the file to indicate the size of the following message, interpreted as a
> signed Int32 in BigEndian format. If this value is positive, it reads that
> many bytes from the connection.
> However, to do this, it first *allocates* a slice of that many bytes. If the
> provided message is not a thrift serialization, this signed Int32 can be
> massive (we're typically seeing it have a value of 1-2 billion), causing it
> to allocate *hundreds* of megabytes of data. This slice is retained until
> the expected number of bytes are received (which generally doesn't happen for
> a malformed connection), or until the connection is closed from the client
> end. When either happen, the BinaryProtocol then type-casts the entire array
> to a String type, which causes a *second* allocation of equal size, before
> returning the message to be processed.
> Since this processing fails, both the byte slice and the string are discarded
> and reclaimed on the next garbage collection cycle, but in the mean type, it
> will have allocated an amount of memory equal to double the value of that
> Int32, in bytes. We're typically seeing this allocation exceed 800-1000 MB
> per variable (so 800 MB when the connection is received, and another 800-1000
> MB when it's closed). The first allocation will persist as long as the
> connection is maintained, and other connections that are similarly malformed
> will cause concurrent allocation spikes, which can easily crash the server.
> To demonstrate this, one can make an extremely simple single-method thrift
> contract, start a server, then netcat to the host port and feed it at least 4
> characters of data.
> To fix this, we implemented an iterative buffered read into the
> BinaryProtocol itself. Instead of allocating the entire array at once, if
> the message indicates it is larger than a certain size (we have it set at
> 32kb, but the actual value is largely arbitrary), it instead reads the
> message in increments of that size, and joins them using the standard library
> bytes.Buffer type. Here's a diff of our changes:
> https://gist.github.com/kaedys/53b8fd05690d5d8f202afa7878d6e3d5
> This fixes the issue of having a massive initial allocation due to a
> malformed request, while still allowing messages of arbitrary incoming size.
> It will likely marginally slow down the reading of extremely large messages
> (ones substantially larger than the constant defined), but the bytes.Buffer
> type is well optimized for growing itself as it is written to. I also
> realize that this only occurs with malformed requests to the server, but I
> would think a server package should be relatively hardened against malformed
> data being submitted to it.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)