Syslog: https://issues.apache.org/jira/browse/NIFI-274
UDP Update: https://issues.apache.org/jira/browse/NIFI-548

On Sat, Apr 25, 2015 at 9:15 PM, Joe Witt <joe.w...@gmail.com> wrote:
> Roger that. And I think adding a property to listen udp to treat
> datagrams as flowfiles rather than a set of datagrams as a flowfile
> would be very doable.
>
> On Sat, Apr 25, 2015 at 9:12 PM, Joey Echeverria <joe...@gmail.com> wrote:
>> A syslog processor would be useful for log aggregation. I'm pretty sure
>> that log4j, etc. have native syslog appenders.
>>
>> -Joey
>> On Sat, Apr 25, 2015 at 12:13 Bryan Bende <bbe...@gmail.com> wrote:
>>
>>> Joe,
>>>
>>> Thanks for the background on ListenUDP.
>>>
>>> The use case I was thinking of was log aggregation... most logging
>>> frameworks like logback, log4j, etc., have a UDP appender, and they also
>>> generally have a json format/layout that conforms with the "logstash"
>>> format. I was thinking it would be cool to be able to use NiFi as an
>>> alternative to logstash, flume, and whatever other technologies are being
>>> used to get logs into a central location. There are obviously other options
>>> besides udp, but it seemed easy and well supported.
>>>
>>> Maybe a property on the processor could control whether or not it buffered
>>> datagrams vs producing a new FlowFile for each datagram?
>>>
>>> -Bryan
>>>
>>>
>>>
>>> On Fri, Apr 24, 2015 at 8:45 PM, Joe Witt <joe.w...@gmail.com> wrote:
>>>
>>> > Mike Moser: Great thinking!
>>> >
>>> > Bryan
>>> >
>>> > Taken from listen udp docs: "This processor listens for Datagram
>>> > Packets on a given port and concatenates the contents of those packets
>>> > together generating flow files roughly as often as the internal buffer
>>> > fills up or until no more data is currently available."
>>> >
>>> > Quite honestly when this processor was originally built NiFi didn't
>>> > have the ability to do the sort of fancy 'slab allocation' mechanism
>>> > it supports today when generating a stream of flow files. So we could
>>> > probably pretty easily reimplement this to behave more like you were
>>> > thinking it should. But it is probably worth a bit of
>>> > discussion/exploration to see what makes sense. The case we built it
>>> > for was data arriving in UDP packets and it was structured in such a
>>> > way that simple binary concatenation was sufficient because the data
>>> > was inherently demarcatable/stream processing friendly. We could,
>>> > however, implement it now such that each UDP datagram becomes a flow
>>> > file. But not sure that makes sense either. This is sort of the
>>> > inherent challenge of providing a raw socket listener. If the 'thing'
>>> > being exchanged is not clear then we're not sure what the boundary of
>>> > a given flow file should be.
>>> >
>>> > I'll stop rambling: Please if you would describe the use case a bit
>>> > more we can think about whether providing a mode of 'datagram =
>>> > flowfile' makes sense.
>>> >
>>> > Thanks!
>>> > Joe
>>> >
>>> > On Fri, Apr 24, 2015 at 7:44 PM, Bryan Bende <bbe...@gmail.com> wrote:
>>> > > Thanks for the suggestions... looks like it is in fact coming out of
>>> > > ListenUDP like that. I'll try to figure out if this is expected behavior,
>>> > > or possibly something with how the messages are being sent.
>>> > >
>>> > > Sorry for the false alarm about MergeContent.
>>> > >
>>> > > On Fri, Apr 24, 2015 at 9:48 AM, Michael Moser <moser...@gmail.com> wrote:
>>> > >
>>> > >> At first glance, I would suspect ListenUDP is placing more than one UDP
>>> > >> datagram into one flowfile. It might be worth spending some time checking
>>> > >> if that can happen.
>>> > >>
>>> > >> -- Mike
>>> > >>
>>> > >>
>>> > >> On Thu, Apr 23, 2015 at 9:35 PM, Joe Witt <joe.w...@gmail.com> wrote:
>>> > >>
>>> > >> > Are you sure you're not sending the [ , ] over UDP as well ;-)
>>> > >> >
>>> > >> > Can you create a template of your flow and send it over? Perhaps just
>>> > >> > attach to a JIRA for this. MergeContent is a powerful and useful
>>> > >> > thing so if you're seeing funky behavior we want to sort it out
>>> > >> > quickly.
>>> > >> >
>>> > >> > On Thu, Apr 23, 2015 at 8:47 PM, Bryan Bende <bbe...@gmail.com> wrote:
>>> > >> > > I'm trying to use MergeContent to merge json documents. I have the Header.
>>> > >> > > Demarcator, and Footer properties pointing to files with [ , ]
>>> > >> > > respectively. I left all other properties the same, and set Max Entries to
>>> > >> > > 5 and Max Bin Age to 10 seconds.
>>> > >> > >
>>> > >> > > I have a simple flow with ListenUDP -> MergeContent -> PutSolrContentStream
>>> > >> > > (from the pull request). If I send a bunch of json documents over UDP, most
>>> > >> > > of them will merge correctly, but I'll see a couple where the demarcator
>>> > >> > > didn't get inserted between two json documents.
>>> > >> > >
>>> > >> > > Any thoughts as to why this would happen?
>>> > >> > >
>>> > >> > > I added a significant amount of logging to the getDescriptorFileContent()
>>> > >> > > method in MergeContent to see if there was a reason why it would return
>>> > >> > > null for the demarcator, but nothing obvious is really jumping out at me.
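
For anyone reading this thread later, the symptom discussed above (a missing demarcator between two json documents) can be reproduced without NiFi. The Python sketch below is only an illustration under stated assumptions: `merge` is a hypothetical stand-in for MergeContent's Header/Demarcator/Footer wrapping, and the `buffered` list imitates the ListenUDP behavior quoted from its docs, where several datagrams are concatenated into one flow file. None of these names come from the NiFi code base.

```python
import json


def merge(flowfiles, header="[", demarcator=",", footer="]"):
    """Hypothetical stand-in for MergeContent: join flow file contents
    with a demarcator and wrap them in a header and footer."""
    return header + demarcator.join(flowfiles) + footer


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


# Three json documents, each sent as its own UDP datagram.
datagrams = ['{"id": 1}', '{"id": 2}', '{"id": 3}']

# Mode A ("datagram = flowfile"): every datagram becomes one flow file.
per_datagram = list(datagrams)

# Mode B (concatenation): the first two datagrams arrive close together
# and end up in the same flow file, with nothing separating them.
buffered = ["".join(datagrams[:2]), datagrams[2]]

merged_a = merge(per_datagram)
merged_b = merge(buffered)

print(merged_a, is_valid_json(merged_a))
print(merged_b, is_valid_json(merged_b))
```

In mode B the merge step only sees two flow files, so it inserts a single demarcator, and the two documents that were concatenated upstream stay glued together. That matches the conclusion in the thread that the problem came out of ListenUDP rather than MergeContent.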