Phil D'Amore created FLUME-2798:
-----------------------------------

             Summary: Malformed Syslog messages can lead to OutOfMemoryException
                 Key: FLUME-2798
                 URL: https://issues.apache.org/jira/browse/FLUME-2798
             Project: Flume
          Issue Type: Bug
          Components: Sinks+Sources
    Affects Versions: v1.6.0, v1.5.0, v1.4.0
            Reporter: Phil D'Amore
            Priority: Critical


A client that submits syslog data malformed in certain ways can cause 
SyslogUtils.extractEvent to keep filling the ByteArrayOutputStream it uses to 
collect the event until the agent runs out of memory.  Since the OOM condition 
affects the whole agent, a client sending such data (by accident or with 
malicious intent) can disable the agent for as long as it remains connected.

Note that this is probably only possible using SyslogTcpSource, although the 
fix touches common code in SyslogUtils.java.

The issue can happen in two ways:

Scenario 1: Send a message like this:

{{<> some more stuff here}}

This causes a NumberFormatException:

{code}
Sep 11, 2015 2:27:07 AM org.jboss.netty.channel.SimpleChannelHandler
WARNING: EXCEPTION, please implement org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.exceptionCaught() for proper handling.
java.lang.NumberFormatException: For input string: ""
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:504)
        at java.lang.Integer.parseInt(Integer.java:527)
        at org.apache.flume.source.SyslogUtils.buildEvent(SyslogUtils.java:198)
        at org.apache.flume.source.SyslogUtils.extractEvent(SyslogUtils.java:344)
        at org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.messageReceived(SyslogTcpSource.java:76)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:94)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:364)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:238)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:38)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
{code}

This exception is not handled, and it is thrown before reset() can be called.  
The result is that the state machine in SyslogUtils gets stuck in the DATA 
state, and all subsequent data is appended to the ByteArrayOutputStream (baos) 
while the above exception is logged repeatedly.  Eventually the agent runs out 
of memory.
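
To illustrate the failure mode, here is a minimal sketch of a parser that gets stuck in the same way.  It is not Flume's code: the class StuckParserDemo, its feed() method and its three-state machine are hypothetical stand-ins, assuming only what is described above (an empty priority makes Integer.parseInt throw before reset() runs).

```java
import java.io.ByteArrayOutputStream;

// Hypothetical, simplified stand-in for the parser described above; not Flume's code.
class StuckParserDemo {
    enum Mode { START, PRIO, DATA }

    private Mode mode = Mode.START;
    private final StringBuilder prio = new StringBuilder();
    private final ByteArrayOutputStream baos = new ByteArrayOutputStream();

    /** Feeds one byte; returns the completed message body, or null if none is complete yet. */
    String feed(byte b) {
        switch (mode) {
            case START:
                if (b == '<') {
                    mode = Mode.PRIO;
                }
                return null;
            case PRIO:
                if (b == '>') {
                    mode = Mode.DATA;
                } else {
                    prio.append((char) b);
                }
                return null;
            default: // DATA
                if (b != '\n') {
                    baos.write(b);
                    return null;
                }
                // For "<>" the priority is empty, so this throws
                // NumberFormatException and reset() below is never reached.
                Integer.parseInt(prio.toString());
                String body = baos.toString();
                reset();
                return body;
        }
    }

    private void reset() {
        mode = Mode.START;
        prio.setLength(0);
        baos.reset();
    }

    /** Number of bytes currently held in the buffer. */
    int buffered() {
        return baos.size();
    }
}
```

Feeding this sketch the bytes of "<> a" followed by a newline throws once and leaves the parser in DATA.  Every later line, even a well-formed one, is then appended to the buffer and throws again at its newline, so buffered() only ever grows.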

Scenario 2: Send some data like this:

{{<123...........}}

No length checking is done in the PRIO state, so data that never closes the 
priority field, as above, could potentially fill the agent memory this way too.
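
One way to bound the PRIO state is to cap the number of digits accepted before the closing '>'.  The sketch below is an assumption about how such a check could look, not the attached patch: PrioLimitSketch, parsePriority and MAX_PRIO_DIGITS are hypothetical names, and the limit of 3 relies on syslog priority values never exceeding 191.  For brevity it parses a whole byte array rather than a stream.

```java
// Hypothetical sketch of a length-checked priority parser; not Flume's code.
class PrioLimitSketch {
    // Assumption: syslog priorities are at most three digits (0 to 191).
    static final int MAX_PRIO_DIGITS = 3;

    /**
     * Returns the priority from a header of the form "<digits>",
     * or -1 if the header is malformed or the priority is too long.
     */
    static int parsePriority(byte[] data) {
        if (data.length == 0 || data[0] != '<') {
            return -1;
        }
        int value = 0;
        int digits = 0;
        for (int i = 1; i < data.length; i++) {
            byte b = data[i];
            if (b == '>') {
                // An empty priority ("<>") is rejected instead of being parsed.
                return digits == 0 ? -1 : value;
            }
            // Give up as soon as a non-digit appears or the digit cap is exceeded,
            // so at most MAX_PRIO_DIGITS + 2 bytes are examined.
            if (b < '0' || b > '9' || ++digits > MAX_PRIO_DIGITS) {
                return -1;
            }
            value = value * 10 + (b - '0');
        }
        return -1; // ran out of data without seeing '>'
    }
}
```

With this check, both sample inputs from the two scenarios are rejected after a few bytes, while a normal header such as "<13>" still yields its priority.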

I'm attaching a patch which handles both of these issues and adds more 
exception handling to buildEvent to make sure that reset() is called in future 
unforeseen situations.
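
As a rough illustration of the "always reset" idea (again a sketch under assumptions, not the contents of the patch), the buffers can be cleared in a finally block so that no failure path leaves stale state behind.  ResetOnFailureSketch and its methods are hypothetical names, and returning null for a malformed message is just one possible policy.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch of resetting parser state on every exit path; not Flume's code.
class ResetOnFailureSketch {
    private final StringBuilder prio = new StringBuilder();
    private final ByteArrayOutputStream body = new ByteArrayOutputStream();

    void appendPrio(char c) {
        prio.append(c);
    }

    void appendBody(byte b) {
        body.write(b);
    }

    /** Total number of characters and bytes currently buffered. */
    int buffered() {
        return prio.length() + body.size();
    }

    /**
     * Builds "priority:body", or returns null when the priority is malformed.
     * The buffers are cleared whether or not parsing succeeds.
     */
    String buildEvent() {
        try {
            int priority = Integer.parseInt(prio.toString());
            return priority + ":" + body.toString();
        } catch (NumberFormatException e) {
            return null; // drop the malformed message
        } finally {
            // Runs on success, on the handled failure, and on any other exception.
            prio.setLength(0);
            body.reset();
        }
    }
}
```

The finally block is what covers the unforeseen cases: even an exception type that is not caught here would still leave the buffers empty for the next message.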

Thanks also to [~roshan_naik] for helping to make this patch better.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
