Re: Merge ListenSyslog events

2016-01-08 Thread Bryan Bende
Hello,

Glad to hear you are getting started using ListenSyslog!

You are definitely running into something that we should consider
supporting. The current implementation treats each new-line as the message
delimiter and places each message onto a queue.

When the processor is triggered, it grabs messages from the queue up to the
"Max Batch Size". So in the default case it grabs a single message from the
queue, which in your case is a single line from one of the multi-line
messages, and produces a FlowFile. When "Max Batch Size" is set higher, say
to 100, it grabs up to 100 messages and produces a single FlowFile
containing all of them.
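
To make the batching concrete, here is a rough sketch in Python. It is not
the actual processor code; the function name drain_batch and the in-memory
queue are made up purely for illustration.

```python
from collections import deque

def drain_batch(queue, max_batch_size):
    """Pop up to max_batch_size messages and join them into one payload.

    Hypothetical stand-in for what happens on each trigger; the real
    processor is Java and works with FlowFiles, not strings.
    """
    batch = []
    while queue and len(batch) < max_batch_size:
        batch.append(queue.popleft())
    return "\n".join(batch)

# Each queued entry is one new-line delimited message (one line).
queue = deque([
    "java-server-1 message1 line1",
    "java-server-2 message1 line1",
    "java-server-1 message1 line2",
])

first = drain_batch(queue, 1)    # default case: a single line per FlowFile
rest = drain_batch(queue, 100)   # larger batch: whatever is queued, up to 100
```

Note that a larger batch only takes whatever happens to be next in the queue;
it knows nothing about where one stack trace ends and another begins.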

The messages in the queue come from all of the incoming connections at the
same time, which is why you don't see all the lines from one server together
and in order. Imagine the queue having something like:

java-server-1 message1 line1
java-server-2 message1 line1
java-server-1 message1 line2
java-server-3 message1 line1
java-server-2 message1 line2


I would need to dig into that Splunk documentation a little more, but I
think you are right that we could possibly expose some kind of message
delimiter pattern on the processor, which would be applied when reading the
messages, before they even make it into the queue, so that by the time a
message gets put in the queue it would contain all of the lines from that
message.
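
Just to illustrate the idea, here is a minimal sketch in Python of grouping
lines with a "start of message" pattern. Nothing like this exists in
ListenSyslog today; the pattern (a line starting with an ISO-style date), the
function name group_lines, and the sample log lines are all assumptions for
the example, and in practice this would have to run per connection.

```python
import re

# Hypothetical pattern: a new message starts with a date such as 2016-01-08.
MESSAGE_START = re.compile(r"^\d{4}-\d{2}-\d{2} ")

def group_lines(lines, start_pattern=MESSAGE_START):
    """Yield complete messages from the lines of ONE connection.

    A line that does not match start_pattern is treated as a continuation
    of the previous message (for example, a stack trace line).
    """
    current = []
    for line in lines:
        if start_pattern.match(line) and current:
            # A new message begins, so emit the one collected so far.
            yield "\n".join(current)
            current = []
        current.append(line)
    if current:
        yield "\n".join(current)

lines = [
    "2016-01-08 11:36:00 ERROR Something failed",
    "java.lang.IllegalStateException: boom",
    "\tat com.example.Foo.bar(Foo.java:42)",
    "2016-01-08 11:36:01 INFO Recovered",
]
messages = list(group_lines(lines))
```

With that input, the first three lines come out as one message and the last
line as a second one, so only whole messages would ever reach the queue.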

Given the current situation, there might be one other option for you. Are
you able to control/change the logback/log4j configuration for the servers
sending the logs?

If so, a JSON layout might solve the problem. These configuration files
show how to do that:
https://github.com/bbende/jsonevent-producer/tree/master/src/main/resources

I know this worked well with the ListenUDP processor to ensure that an
entire stack trace was sent as a single JSON document, but I have not had a
chance to try it with ListenSyslog and the SyslogAppender.
If you are using ListenSyslog with TCP, then it will probably come down to
whether logback/log4j puts new-lines inside the JSON document, or only a
single new-line at the end.
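
As a small illustration of why a single-line JSON document would get through
a new-line delimited reader intact, here is a Python sketch. The field names
are made up and are not necessarily what the layouts in the repository above
produce.

```python
import json

stack_trace = (
    "java.lang.IllegalStateException: boom\n"
    "\tat com.example.Foo.bar(Foo.java:42)"
)
# Hypothetical event shape, for illustration only.
event = {"level": "ERROR", "message": "Something failed", "stack_trace": stack_trace}

# json.dumps escapes the new-lines inside the string values, so the only
# real new-line on the wire is the one appended at the end.
wire = json.dumps(event) + "\n"

received = wire.splitlines()          # what a new-line delimited reader sees
decoded = json.loads(received[0])     # the stack trace is recovered in full
```

If the layout pretty-prints the JSON across several lines instead, the same
splitting problem comes back.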

-Bryan


On Fri, Jan 8, 2016 at 11:36 AM, Louis-Étienne Dorval wrote:

> Hi everyone!
>
> I'm looking to use the new ListenSyslog processor in a proof-of-concept
> project, but I've run into a problem for which I can't find a suitable
> solution (yet!).
> I'm receiving logs from multiple Java-based servers using a logback/log4j
> SyslogAppender. The messages are received successfully, but when a stack
> trace occurs, each line ends up in its own FlowFile.
>
> I'm trying to achieve something like the following:
> http://docs.splunk.com/Documentation/Splunk/6.2.2/Data/Indexmulti-lineevents
>
> I tried:
> - Increasing the "Max Batch Size", but I end up merging lines that should
> not be merged, and there's no way to know the length of the stack trace...
> - Using MergeContent with the host as "Correlation Attribute Name", but as
> before I merge lines that should not be merged.
> - Using MergeContent followed by SplitContent. That might work, but
> SplitContent is pretty restrictive and I can't find a "Byte Sequence" that
> distinguishes stack trace lines.
>
> Even if I find a magic "Byte Sequence" for my last attempt (MergeContent +
> SplitContent), I would most probably lose part of the stack trace, as
> MergeContent is limited by the "Max Batch Size".
>
>
> The only solution that I see is to modify ListenSyslog to add a parameter
> similar to what the Splunk documentation describes and use that rather
> than a fixed "Max Batch Size".
>
> Am I missing another option?
> Would that be a suitable feature? (Maybe I should ask that question on the
> dev mailing list.)
>
> Best regards!
>


Merge ListenSyslog events

2016-01-08 Thread Louis-Étienne Dorval
Hi everyone!

I'm looking to use the new ListenSyslog processor in a proof-of-concept
project, but I've run into a problem for which I can't find a suitable
solution (yet!).
I'm receiving logs from multiple Java-based servers using a logback/log4j
SyslogAppender. The messages are received successfully, but when a stack
trace occurs, each line ends up in its own FlowFile.

I'm trying to achieve something like the following:
http://docs.splunk.com/Documentation/Splunk/6.2.2/Data/Indexmulti-lineevents

I tried:
- Increasing the "Max Batch Size", but I end up merging lines that should
not be merged, and there's no way to know the length of the stack trace...
- Using MergeContent with the host as "Correlation Attribute Name", but as
before I merge lines that should not be merged.
- Using MergeContent followed by SplitContent. That might work, but
SplitContent is pretty restrictive and I can't find a "Byte Sequence" that
distinguishes stack trace lines.

Even if I find a magic "Byte Sequence" for my last attempt (MergeContent +
SplitContent), I would most probably lose part of the stack trace, as
MergeContent is limited by the "Max Batch Size".


The only solution that I see is to modify ListenSyslog to add a parameter
similar to what the Splunk documentation describes and use that rather
than a fixed "Max Batch Size".

Am I missing another option?
Would that be a suitable feature? (Maybe I should ask that question on the
dev mailing list.)

Best regards!