Hi,

Thanks for the reply, Bryan.

I'd rather not update the logback/log4j configuration because the service is
already in place, and for now I'm just trying to fit around the current
system. In any case, according to the RFC a syslog message must not be longer
than 1024 bytes, so a single "event" might be split anyway.
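Just to illustrate what I mean (a rough sketch of my own, not how the
appender actually chunks its data): anything longer than the 1024-byte limit
would have to go out as more than one syslog message.

```python
# Rough illustration: an event larger than the RFC's 1024-byte limit
# cannot fit in a single syslog message and must be split.
MAX_SYSLOG_BYTES = 1024

def chunk_event(event: bytes, limit: int = MAX_SYSLOG_BYTES):
    """Split an event into syslog-sized payloads."""
    return [event[i:i + limit] for i in range(0, len(event), limit)]

long_event = b"x" * 2500  # e.g. a big stack trace
print(len(chunk_event(long_event)))  # -> 3 messages
```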

I've created NIFI-1392 for this feature. I'm not sure of the process for a
feature request, but I'll try to find some time to create a pull request or
a patch for it.
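
For reference, here is the rough idea I have in mind for the delimiter
option: buffer lines and only start a new event when a line matches a
"start of message" pattern, similar to Splunk's BREAK_ONLY_BEFORE. Just a
sketch in Python with a made-up timestamp pattern, not actual NiFi code:

```python
import re

# Hypothetical start-of-message pattern: here, a leading date.
START_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}")

def group_events(lines):
    """Group raw lines into events, breaking only before a matching line."""
    event = []
    for line in lines:
        if START_PATTERN.match(line) and event:
            yield "\n".join(event)
            event = []
        event.append(line)
    if event:
        yield "\n".join(event)

lines = [
    "2016-01-08 12:00:01 ERROR boom",
    "java.lang.NullPointerException",
    "\tat com.example.Foo.bar(Foo.java:42)",
    "2016-01-08 12:00:02 INFO recovered",
]
print(list(group_events(lines)))
```

With this, the stack trace stays attached to the log line that produced it.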


Best regards,
Louis-Etienne

On 8 January 2016 at 12:15, Bryan Bende <[email protected]> wrote:

> Hello,
>
> Glad to hear you are getting started using ListenSyslog!
>
> You are definitely running into something that we should consider
> supporting. The current implementation treats each new-line as the message
> delimiter and places each message on to a queue.
>
> When the processor is triggered, it grabs messages from the queue up to
> the "Max Batch Size". So in the default case it grabs a single message from
> the queue, which in your case is a single line
> from one of the multi-line messages, and produces a FlowFile. When "Max
> Batch Size" is set higher, say to 100, it grabs up to 100 messages and
> produces a FlowFile containing all 100 messages.
>
> The messages in the queue are simultaneously coming from all of the
> incoming connections, which is why you don't see all of the lines from one
> server together, in order. Imagine the queue having something like:
>
> java-server-1 message1 line1
> java-server-2 message1 line1
> java-server-1 message1 line2
> java-server-3 message1 line1
> java-server-2 message1 line2
> ....
>
> I would need to dig into that Splunk documentation a little more, but I
> think you are right that we could possibly expose some kind of message
> delimiter pattern on the processor, which
> would be applied when reading the messages, before they even make it into
> the queue, so that by the time a message gets put in the queue it would
> contain all of the lines from one event.
>
> Given the current situation, there might be one other option for you. Are
> you able to control/change the logback/log4j configuration for the servers
> sending the logs?
>
> If so, a JSON layout might solve the problem. These configuration files
> show how to do that:
> https://github.com/bbende/jsonevent-producer/tree/master/src/main/resources
>
> I know this worked well with the ListenUDP processor to ensure that an
> entire stack trace was sent as a single JSON document, but I have not had a
> chance to try it with ListenSyslog and the SyslogAppender.
> If you are using ListenSyslog with TCP, then it will probably come down to
> whether logback/log4j puts new-lines inside the JSON document, or only a
> single new-line at the end.
>
> -Bryan
>
>
> On Fri, Jan 8, 2016 at 11:36 AM, Louis-Étienne Dorval <[email protected]>
> wrote:
>
>> Hi everyone!
>>
>> I'm looking to use the new ListenSyslog processor in a proof-of-concept
>> project, but I've encountered a problem that I can't find a suitable
>> solution for (yet!).
>> I'm receiving logs from multiple Java-based servers using a logback/log4j
>> SyslogAppender. The messages are received successfully, but when a stack
>> trace occurs, each of its lines is broken into a separate FlowFile.
>>
>> I'm trying to achieve something like the following:
>> http://docs.splunk.com/Documentation/Splunk/6.2.2/Data/Indexmulti-lineevents
>>
>> I tried:
>> - Increasing the "Max Batch Size", but I end up merging lines that should
>> not be merged, and there's no way to know the length of the stack trace...
>> - Using MergeContent with the host as the "Correlation Attribute Name",
>> but as before I merge lines that should not be merged.
>> - Using MergeContent followed by SplitContent; that might work, but
>> SplitContent is pretty restrictive and I can't find a "Byte Sequence" that
>> would separate messages without breaking up stack traces.
>>
>> Even if I found a magic "Byte Sequence" for my last attempt (MergeContent
>> + SplitContent), I would most probably lose part of the stack trace, as
>> MergeContent is limited by the "Max Batch Size".
>>
>>
>> The only solution I see is to modify ListenSyslog to add a parameter
>> similar to what the Splunk documentation describes, and use that rather
>> than a fixed "Max Batch Size".
>>
>> Am I missing another option?
>> Would that be a suitable feature? (Maybe I should ask that question on
>> the dev mailing list.)
>>
>> Best regards!
>>
>
>
