Phil D'Amore created FLUME-2798:
-----------------------------------
Summary: Malformed Syslog messages can lead to OutOfMemoryException
Key: FLUME-2798
URL: https://issues.apache.org/jira/browse/FLUME-2798
Project: Flume
Issue Type: Bug
Components: Sinks+Sources
Affects Versions: v1.6.0, v1.5.0, v1.4.0
Reporter: Phil D'Amore
Priority: Critical
A client submitting syslog data that is malformed in various ways can cause
SyslogUtils.extractEvent to keep filling the ByteArrayOutputStream it uses to
collect the event until the agent runs out of memory. Since the OOM condition
affects the whole agent, a client sending such data (by accident or with
malicious intent) can disable the agent, as long as it remains connected.
Note that this is probably only possible using SyslogTcpSource although the fix
touches common code in SyslogUtils.java.
The issue can happen in two ways:
Scenario 1: Send a message like this:
{{<> some more stuff here}}
This causes a NumberFormatException:
{code}
Sep 11, 2015 2:27:07 AM org.jboss.netty.channel.SimpleChannelHandler
WARNING: EXCEPTION, please implement org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.exceptionCaught() for proper handling.
java.lang.NumberFormatException: For input string: ""
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:504)
at java.lang.Integer.parseInt(Integer.java:527)
at org.apache.flume.source.SyslogUtils.buildEvent(SyslogUtils.java:198)
at org.apache.flume.source.SyslogUtils.extractEvent(SyslogUtils.java:344)
at org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.messageReceived(SyslogTcpSource.java:76)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:94)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:364)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:238)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:38)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{code}
This exception is not handled, and it is thrown before reset() can be called.
As a result, the state machine in SyslogUtils gets stuck in the DATA state, and
all subsequent data is simply appended to the ByteArrayOutputStream (baos) while
the above exception streams to the log. Eventually the agent runs out of memory.
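To illustrate the stuck state, here is a minimal sketch of a toy parser. It is
not the real SyslogUtils code: the class StuckStateSketch, its fields, and the
"priority:body" event format are all hypothetical. With guarded=false, reset()
is reached only when parsing succeeds, so a "<>" message leaves the parser in
DATA with a growing buffer. With guarded=true, a try/finally shows one possible
way to make sure reset() always runs.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified model of the parser; NOT the actual Flume code.
class StuckStateSketch {
  enum Mode { START, PRIO, DATA }

  private final ByteArrayOutputStream baos = new ByteArrayOutputStream();
  private final StringBuilder prio = new StringBuilder();
  private final boolean guarded;
  private Mode mode = Mode.START;

  StuckStateSketch(boolean guarded) { this.guarded = guarded; }

  int bufferedBytes() { return baos.size(); }
  Mode mode() { return mode; }

  private void reset() {
    baos.reset();
    prio.setLength(0);
    mode = Mode.START;
  }

  // Throws NumberFormatException when the priority is empty, as with "<>".
  private String format() {
    int priority = Integer.parseInt(prio.toString());
    return priority + ":" + new String(baos.toByteArray(), StandardCharsets.UTF_8);
  }

  private String buildEvent() {
    if (!guarded) {
      String event = format(); // may throw before reset() is reached
      reset();
      return event;
    }
    try {
      return format();
    } finally {
      reset();                 // runs even when format() throws
    }
  }

  // Feeds ASCII input; returns the last completed event, or null if none.
  String feed(String input) {
    String event = null;
    for (char c : input.toCharArray()) {
      switch (mode) {
        case START:
          if (c == '<') mode = Mode.PRIO;
          break;
        case PRIO:
          if (c == '>') mode = Mode.DATA; else prio.append(c);
          break;
        case DATA:
          if (c == '\n') event = buildEvent(); else baos.write(c);
          break;
      }
    }
    return event;
  }
}
```

In the unguarded variant, every later message, even a well-formed one, is
appended to the buffer and fails again with the same exception. The guarded
variant drops the bad message and parses the next one normally.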
Scenario 2: Send some data like this:
{{<123...........}}
No length checking is done in the PRIO state, so a priority field that is never
closed with '>' can keep growing and potentially fill the agent memory this way
too.
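One way to bound the PRIO state is sketched below. This is an illustration
only, not the attached patch: the class PrioLimitSketch and the constant
MAX_PRIO_DIGITS are hypothetical names. The limit of 3 is an assumption based on
the syslog RFCs, where the priority value has at most three digits.

```java
// Hypothetical sketch of a length check in the PRIO state; NOT the actual patch.
class PrioLimitSketch {
  // Assumption: syslog priority values have at most three digits.
  static final int MAX_PRIO_DIGITS = 3;

  enum Mode { START, PRIO, DATA }

  private final StringBuilder prio = new StringBuilder();
  private Mode mode = Mode.START;
  private int rejected = 0;

  int prioLength() { return prio.length(); }
  int rejected() { return rejected; }
  Mode mode() { return mode; }

  void feed(String input) {
    for (char c : input.toCharArray()) {
      switch (mode) {
        case START:
          if (c == '<') mode = Mode.PRIO;
          break;
        case PRIO:
          if (c == '>') {
            mode = Mode.DATA;
          } else if (Character.isDigit(c) && prio.length() < MAX_PRIO_DIGITS) {
            prio.append(c);
          } else {
            // Malformed or over-long priority: discard it and start over,
            // so the priority buffer can never exceed MAX_PRIO_DIGITS.
            rejected++;
            prio.setLength(0);
            mode = Mode.START;
          }
          break;
        case DATA:
          // Body handling is omitted in this sketch.
          break;
      }
    }
  }
}
```

With this check, input such as "<123" followed by an arbitrary number of dots is
rejected at the first dot, and the remaining bytes are skipped until the next
'<' instead of being accumulated.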
I'm attaching a patch which handles both of these issues and adds more
exception handling to buildEvent to make sure that reset() is called in future
unforeseen situations.
Thanks also to [~roshan_naik] for helping to make this patch better.