Hi, thanks for the reply Bryan.

I'd rather not update the logback/log4j configuration because the service is already in place, and for now I'm just trying to fit around the current system. In any case, according to the syslog RFC (RFC 3164), a syslog message must not be longer than 1024 bytes, so a single "event" might be split anyway.

I've created NIFI-1392 for that feature. I'm not sure of the process for a feature request, but I'll try to find some time to create a pull request or a patch for this.

Best regards,
Louis-Etienne

On 8 January 2016 at 12:15, Bryan Bende <[email protected]> wrote:

> Hello,
>
> Glad to hear you are getting started using ListenSyslog!
>
> You are definitely running into something that we should consider
> supporting. The current implementation treats each new-line as the message
> delimiter and places each message onto a queue.
>
> When the processor is triggered, it grabs messages from the queue up to
> the "Max Batch Size". So in the default case it grabs a single message from
> the queue, which in your case is a single line from one of the multi-line
> messages, and produces a FlowFile. When "Max Batch Size" is set higher, to
> say 100, it grabs up to 100 messages and produces a FlowFile containing
> all 100 messages.
>
> The messages in the queue are simultaneously coming from all of the
> incoming connections, so this is why you don't see all the lines from one
> server in the same order. Imagine the queue having something like:
>
> java-server-1 message1 line1
> java-server-2 message1 line1
> java-server-1 message1 line2
> java-server-3 message1 line1
> java-server-2 message1 line2
> ....
>
> I would need to dig into that Splunk documentation a little more, but I
> think you are right that we could possibly expose some kind of message
> delimiter pattern on the processor, which would be applied when reading
> the messages, before they even make it into the queue, so that by the time
> a message gets put in the queue it would contain all of the lines from one
> message.
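[Editor's note: the message-delimiter idea Bryan describes could be sketched roughly as follows. This is a minimal Python sketch, not NiFi code; the `NEW_MESSAGE` pattern is a hypothetical example of what a configurable delimiter property could match (here, a syslog priority header such as `<13>`), with any line lacking one treated as a continuation of the current message.]

```python
import re

# Hypothetical delimiter: a new message starts on a line bearing a
# syslog priority header like "<13>"; other lines are continuations.
NEW_MESSAGE = re.compile(r"^<\d+>")

def group_messages(lines):
    """Group raw incoming lines into multi-line messages before queueing."""
    message = []
    for line in lines:
        if NEW_MESSAGE.match(line) and message:
            yield "\n".join(message)
            message = []
        message.append(line)
    if message:
        yield "\n".join(message)

lines = [
    "<13>Jan  8 12:00:00 java-server-1 app: java.lang.RuntimeException: boom",
    "    at com.example.Foo.bar(Foo.java:42)",
    "    at com.example.Main.main(Main.java:7)",
    "<13>Jan  8 12:00:01 java-server-2 app: started",
]
messages = list(group_messages(lines))
# The stack trace becomes one message; the second event is another.
```

With such grouping done before the queue, the interleaving across connections that Bryan illustrates would no longer split a stack trace apart.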
>
> Given the current situation, there might be one other option for you. Are
> you able to control/change the logback/log4j configuration for the servers
> sending the logs?
>
> If so, a JSON layout might solve the problem. These configuration files
> show how to do that:
> https://github.com/bbende/jsonevent-producer/tree/master/src/main/resources
>
> I know this worked well with the ListenUDP processor to ensure that an
> entire stack trace was sent as a single JSON document, but I have not had
> a chance to try it with ListenSyslog and the SyslogAppender.
> If you are using ListenSyslog with TCP, then it will probably come down to
> whether logback/log4j puts new-lines inside the JSON document, or only a
> single new-line at the end.
>
> -Bryan
>
> On Fri, Jan 8, 2016 at 11:36 AM, Louis-Étienne Dorval <[email protected]>
> wrote:
>
>> Hi everyone!
>>
>> I'm looking to use the new ListenSyslog processor in a proof-of-concept
>> project, but I've encountered a problem that I can't find a suitable
>> solution for (yet!).
>> I'm receiving logs from multiple Java-based servers using a logback/log4j
>> SyslogAppender. The messages are received successfully, but when a stack
>> trace occurs, each line is broken out into a separate FlowFile.
>>
>> I'm trying to achieve something like the following:
>> http://docs.splunk.com/Documentation/Splunk/6.2.2/Data/Indexmulti-lineevents
>>
>> I tried:
>> - Increasing the "Max Batch Size", but I end up merging lines that
>> should not be merged, and there's no way to know the length of the stack
>> trace...
>> - Using MergeContent with the host as "Correlation Attribute Name", but
>> as before I merge lines that should not be merged.
>> - Using MergeContent followed by SplitContent; that might work, but
>> SplitContent is pretty restrictive and I can't find a "Byte Sequence"
>> that is distinct from the stack trace content.
>>
>> Even if I find a magic "Byte Sequence" for my last try (MergeContent +
>> SplitContent), I would most probably lose part of the stack trace, as
>> MergeContent is limited by the "Max Batch Size".
>>
>> The only solution that I see is to modify ListenSyslog to add a parameter
>> similar to what the Splunk documentation describes, and use that rather
>> than a fixed "Max Batch Size".
>>
>> Am I missing another option?
>> Would that be a suitable feature? (Maybe I should ask that question on
>> the dev mailing list.)
>>
>> Best regards!
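[Editor's note: the reason the JSON-layout option Bryan mentions can keep a stack trace together is that JSON serialization escapes embedded newlines, so the whole event travels as a single physical line. A minimal sketch using plain Python `json` rather than any specific logback layout:]

```python
import json

# A multi-line Java stack trace as it would appear in a log event.
stack_trace = ("java.lang.RuntimeException: boom\n"
               "    at com.example.Foo.bar(Foo.java:42)\n"
               "    at com.example.Main.main(Main.java:7)")

# Serializing the event as JSON escapes the newlines as "\n", so the
# entire trace fits on one physical line over the wire.
event = json.dumps({"host": "java-server-1", "message": stack_trace})
assert "\n" not in event

# The receiver recovers the full multi-line trace intact after parsing.
assert json.loads(event)["message"] == stack_trace
```

Whether this holds end-to-end still depends, as Bryan notes, on the appender emitting only a single trailing newline after the JSON document.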
