Roger that. And I think adding a property to ListenUDP so that each datagram becomes its own flowfile, rather than a set of datagrams becoming one flowfile, would be very doable.
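For anyone following along, here is a rough Python model of what such a property might toggle. It is only a sketch: `to_flowfiles`, `per_datagram`, and `buffer_size` are made-up names for illustration, not NiFi APIs, and the buffered branch only approximates the "flush when the internal buffer fills up" behavior described in the ListenUDP docs quoted further down.

```python
from typing import Iterable, List


def to_flowfiles(datagrams: Iterable[bytes],
                 per_datagram: bool,
                 buffer_size: int = 1024) -> List[bytes]:
    """Group datagram payloads into flowfile contents (hypothetical model)."""
    if per_datagram:
        # Proposed mode: one flowfile per datagram, boundaries preserved.
        return [bytes(d) for d in datagrams]

    # Buffered mode: concatenate payloads and flush when the next
    # payload would push the buffer past buffer_size.
    flowfiles: List[bytes] = []
    buf = bytearray()
    for d in datagrams:
        if buf and len(buf) + len(d) > buffer_size:
            flowfiles.append(bytes(buf))
            buf = bytearray()
        buf.extend(d)
    if buf:
        flowfiles.append(bytes(buf))
    return flowfiles


docs = [b'{"a":1}', b'{"b":2}', b'{"c":3}']
print(to_flowfiles(docs, per_datagram=True))
print(to_flowfiles(docs, per_datagram=False, buffer_size=14))
```

In the buffered call the first two documents end up in a single flowfile with no separator between them, which is the boundary problem discussed in the thread.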
On Sat, Apr 25, 2015 at 9:12 PM, Joey Echeverria <joe...@gmail.com> wrote:

> A syslog processor would be useful for log aggregation. I'm pretty sure
> that log4j, etc. have native syslog appenders.
>
> -Joey
> On Sat, Apr 25, 2015 at 12:13 Bryan Bende <bbe...@gmail.com> wrote:
>
>> Joe,
>>
>> Thanks for the background on ListenUDP.
>>
>> The use case I was thinking of was log aggregation... most logging
>> frameworks like logback, log4j, etc., have a UDP appender, and they also
>> generally have a json format/layout that conforms with the "logstash"
>> format. I was thinking it would be cool to be able to use NiFi as an
>> alternative to logstash, flume, and whatever other technologies are being
>> used to get logs into a central location. There are obviously other options
>> besides udp, but it seemed easy and well supported.
>>
>> Maybe a property on the processor could control whether or not it buffered
>> datagrams vs producing a new FlowFile for each datagram?
>>
>> -Bryan
>>
>>
>>
>> On Fri, Apr 24, 2015 at 8:45 PM, Joe Witt <joe.w...@gmail.com> wrote:
>>
>> > Mike Moser: Great thinking!
>> >
>> > Bryan
>> >
>> > Taken from listen udp docs: "This processor listens for Datagram
>> > Packets on a given port and concatenates the contents of those packets
>> > together generating flow files roughly as often as the internal buffer
>> > fills up or until no more data is currently available."
>> >
>> > Quite honestly when this processor was originally built NiFi didn't
>> > have the ability to do the sort of fancy 'slab allocation' mechanism
>> > it supports today when generating a stream of flow files. So we could
>> > probably pretty easily reimplement this to behave more like you were
>> > thinking it should. But it is probably worth a bit of
>> > discussion/exploration to see what makes sense. The case we built it
>> > for was data arriving in UDP packets and it was structured in such a
>> > way that simple binary concatenation was sufficient because the data
>> > was inherently demarcatable/stream processing friendly. We could,
>> > however, implement it now such that each UDP datagram becomes a flow
>> > file. But not sure that makes sense either. This is sort of the
>> > inherent challenge of providing a raw socket listener. If the 'thing'
>> > being exchanged is not clear then we're not sure what the boundary of
>> > a given flow file should be.
>> >
>> > I'll stop rambling: Please if you would describe the use case a bit
>> > more we can think about whether providing a mode of 'datagram =
>> > flowfile' makes sense.
>> >
>> > Thanks!
>> > Joe
>> >
>> > On Fri, Apr 24, 2015 at 7:44 PM, Bryan Bende <bbe...@gmail.com> wrote:
>> > > Thanks for the suggestions... looks like it is in fact coming out of
>> > > ListenUDP like that. I'll try to figure out if this is expected behavior,
>> > > or possibly something with how the messages are being sent.
>> > >
>> > > Sorry for the false alarm about MergeContent.
>> > >
>> > > On Fri, Apr 24, 2015 at 9:48 AM, Michael Moser <moser...@gmail.com> wrote:
>> > >
>> > >> At first glance, I would suspect ListenUDP is placing more than one UDP
>> > >> datagram into one flowfile. It might be worth spending some time checking
>> > >> if that can happen.
>> > >>
>> > >> -- Mike
>> > >>
>> > >>
>> > >> On Thu, Apr 23, 2015 at 9:35 PM, Joe Witt <joe.w...@gmail.com> wrote:
>> > >>
>> > >> > Are you sure you're not sending the [ , ] over UDP as well ;-)
>> > >> >
>> > >> > Can you create a template of your flow and send it over? Perhaps just
>> > >> > attach to a JIRA for this. MergeContent is a powerful and useful
>> > >> > thing so if you're seeing funky behavior we want to sort it out
>> > >> > quickly.
>> > >> >
>> > >> > On Thu, Apr 23, 2015 at 8:47 PM, Bryan Bende <bbe...@gmail.com> wrote:
>> > >> > > I'm trying to use MergeContent to merge json documents. I have the Header.
>> > >> > > Demarcator, and Footer properties pointing to files with [ , ]
>> > >> > > respectively. I left all other properties the same, and set Max Entries to
>> > >> > > 5 and Max Bin Age to 10 seconds.
>> > >> > >
>> > >> > > I have a simple flow with ListenUDP -> MergeContent -> PutSolrContentStream
>> > >> > > (from the pull request). If I send a bunch of json documents over UDP, most
>> > >> > > of them will merge correctly, but I'll see a couple where the demarcator
>> > >> > > didn't get inserted between two json documents.
>> > >> > >
>> > >> > > Any thoughts as to why this would happen?
>> > >> > >
>> > >> > > I added a significant amount of logging to the getDescriptorFileContent()
>> > >> > > method in MergeContent to see if there was a reason why it would return
>> > >> > > null for the demarcator, but nothing obvious is really jumping out at me.
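As a footnote to the original question in this thread, the small Python sketch below reproduces the symptom without NiFi at all. It assumes a merge step that simply wraps its inputs in the header `[`, demarcator `,`, and footer `]`; the `merge` and `is_valid_json` helpers are illustrative names only and are not MergeContent's implementation. When an upstream step has already concatenated two JSON documents into one input, the merge has no boundary to put a demarcator on, so the result looks like a "missing" comma.

```python
import json
from typing import List


def merge(contents: List[bytes],
          header: bytes = b"[",
          demarcator: bytes = b",",
          footer: bytes = b"]") -> bytes:
    """Join inputs with a demarcator and wrap them (illustrative only)."""
    return header + demarcator.join(contents) + footer


def is_valid_json(payload: bytes) -> bool:
    """Return True if the payload parses as JSON."""
    try:
        json.loads(payload)
        return True
    except json.JSONDecodeError:
        return False


# One JSON document per input: the merged result is a valid JSON array.
separate = merge([b'{"a":1}', b'{"b":2}', b'{"c":3}'])
print(separate, is_valid_json(separate))

# Two documents already concatenated upstream into a single input:
# the merge sees only two inputs, so only one demarcator is written.
concatenated = merge([b'{"a":1}', b'{"b":2}{"c":3}'])
print(concatenated, is_valid_json(concatenated))
```

The second result contains `{"b":2}{"c":3}` with no comma, even though the merge itself behaved correctly for the inputs it was given.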