[ https://issues.apache.org/jira/browse/HADOOP-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daryn Sharp updated HADOOP-10940:
---------------------------------
    Attachment: HADOOP-10940.patch

Invalid rpc responses cause the client to go OOM.  This is killing oozie servers when users try to use a 2.x client against 0.23.  The same applies for 2.x to 1.x.

Added an {{IpcStreams}} object to manage the rpc encoding/decoding.  Response size must be > 0 and < the data length limit used by the rpc server.  Request decoding is simpler and more efficient.  If the first response has length -1, then it's assumed to be a pre-rpcv9 error response.  Pre-rpcv9 responses began with the callId, not a length, and the callId for an error was -1.

This patch also fixes flushing issues, namely multiple sends before a response is read.  This occurs in two cases:
# insecure: connection header+context+call
# secure: connection header+sasl negotiate

Without the fix to control flushing, unit tests that verify an invalid rpc version always failed with a broken pipe.  When the server reads the connection header for an incompatible client, it sends an error response and immediately closes the socket.  The client may still be in the process of sending the multiple messages listed above and hit a broken pipe.

I believe the flushing fix may also solve the sporadic unit test failures under windows about the remote end closing the connection.

> RPC client does no bounds checking of responses
> -----------------------------------------------
>
>                 Key: HADOOP-10940
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10940
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ipc
>    Affects Versions: 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>         Attachments: HADOOP-10940.patch
>
>
> The rpc client does no bounds checking of server responses.  In the case of communicating with an older and incompatible RPC, this may lead to OOM issues and leaking of resources.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
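
For readers who want to see the response-size rule from the comment above in code, here is a minimal sketch. It is not taken from the attached patch and is not the {{IpcStreams}} implementation: the class name BoundedResponseReader, the method readResponse, and the maxLength parameter are all hypothetical, and the sketch assumes the response starts with a 4-byte big-endian length. The point is only that the length is validated before any buffer is allocated, so a bogus length from an incompatible server cannot trigger a huge allocation.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

/** Hypothetical helper for illustration; not the IpcStreams class from the patch. */
final class BoundedResponseReader {

    /**
     * Reads one length-prefixed response, rejecting lengths outside (0, maxLength).
     *
     * @param in            stream positioned at the start of a response
     * @param firstResponse true if this is the first response on the connection
     * @param maxLength     assumed upper bound, e.g. the server's data length limit
     */
    static byte[] readResponse(DataInputStream in, boolean firstResponse, int maxLength)
            throws IOException {
        int length = in.readInt();

        // Pre-rpcv9 servers start a response with the callId, and the error
        // callId was -1, so a leading -1 most likely means an old server.
        if (firstResponse && length == -1) {
            throw new IOException("Server appears to use a pre-v9 RPC protocol");
        }

        // Validate before allocating: size must be > 0 and < maxLength.
        if (length <= 0 || length >= maxLength) {
            throw new IOException("RPC response has invalid length " + length);
        }

        byte[] body = new byte[length];
        in.readFully(body);
        return body;
    }

    public static void main(String[] args) throws IOException {
        // A well-formed frame: length 2 followed by two payload bytes.
        byte[] wire = {0, 0, 0, 2, 7, 8};
        byte[] body = readResponse(
                new DataInputStream(new ByteArrayInputStream(wire)), true, 1024);
        System.out.println("read " + body.length + " bytes");
    }
}
```

A length of -1 on the first response gets its own message because, per the comment above, it most likely indicates an old server rather than random corruption; any other out-of-range length falls through to the generic bounds check.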