Roger that.  And I think adding a property to ListenUDP so that each
datagram becomes its own flowfile, rather than a set of datagrams being
concatenated into one flowfile, would be very doable.
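
To make the difference concrete, here is a rough Python sketch of the two
modes and of what a Header/Demarcator/Footer merge of "[", ",", "]" does to
each. This is only a model of the behavior described in this thread, not NiFi
code: the function names and the buffer_size parameter are made up for
illustration, and the real processor's buffering rules may differ.

```python
import json


def concatenating_listener(datagrams, buffer_size):
    """Model of the documented behavior: payloads are appended to a buffer and
    a flow file is emitted when the next payload would overflow the buffer,
    or when no more data is available. buffer_size is a hypothetical knob."""
    flowfiles, buf = [], b""
    for payload in datagrams:
        if buf and len(buf) + len(payload) > buffer_size:
            flowfiles.append(buf)
            buf = b""
        buf += payload
    if buf:
        flowfiles.append(buf)
    return flowfiles


def per_datagram_listener(datagrams):
    """Model of the proposed mode: one flow file per datagram."""
    return [bytes(payload) for payload in datagrams]


def merge(flowfiles, header=b"[", demarcator=b",", footer=b"]"):
    """Model of a header/demarcator/footer merge: the demarcator is inserted
    between flow files only, never inside one."""
    return header + demarcator.join(flowfiles) + footer


if __name__ == "__main__":
    datagrams = [json.dumps({"id": i}).encode() for i in range(3)]

    # Two documents share a flow file, so no demarcator separates them.
    print(merge(concatenating_listener(datagrams, buffer_size=20)))

    # One document per flow file, so the merged result is a valid JSON array.
    print(merge(per_datagram_listener(datagrams)))
```

In the first case the merge step never sees the boundary between the two
documents that were concatenated upstream, which would look exactly like a
missing demarcator.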

On Sat, Apr 25, 2015 at 9:12 PM, Joey Echeverria <joe...@gmail.com> wrote:
> A syslog processor would be useful for log aggregation. I'm pretty sure
> that log4j, etc. have native syslog appenders.
>
> -Joey
> On Sat, Apr 25, 2015 at 12:13 Bryan Bende <bbe...@gmail.com> wrote:
>
>> Joe,
>>
>> Thanks for the background on ListenUDP.
>>
>> The use case I was thinking of was log aggregation... most logging
>> frameworks like logback, log4j, etc., have a UDP appender, and they also
>> generally have a json format/layout that conforms with the "logstash"
>> format. I was thinking it would be cool to be able to use NiFi as an
>> alternative to logstash, flume, and whatever other technologies are being
>> used to get logs into a central location. There are obviously other options
>> besides udp, but it seemed easy and well supported.
>>
>> Maybe a property on the processor could control whether or not it buffered
>> datagrams vs producing a new FlowFile for each datagram?
>>
>> -Bryan
>>
>>
>>
>> On Fri, Apr 24, 2015 at 8:45 PM, Joe Witt <joe.w...@gmail.com> wrote:
>>
>> > Mike Moser: Great thinking!
>> >
>> > Bryan
>> >
>> > Taken from the ListenUDP docs:  "This processor listens for Datagram
>> > Packets on a given port and concatenates the contents of those packets
>> > together generating flow files roughly as often as the internal buffer
>> > fills up or until no more data is currently available."
>> >
>> > Quite honestly when this processor was originally built NiFi didn't
>> > have the ability to do the sort of fancy 'slab allocation' mechanism
>> > it supports today when generating a stream of flow files.  So we could
>> > probably pretty easily reimplement this to behave more like you were
>> > thinking it should.  But it is probably worth a bit of
>> > discussion/exploration to see what makes sense.  The case we built it
>> > for was data arriving in UDP packets and it was structured in such a
>> > way that simple binary concatenation was sufficient because the data
>> > was inherently demarcatable/stream processing friendly.  We could,
>> > however, implement it now such that each UDP datagram becomes a flow
>> > file.  But not sure that makes sense either.  This is sort of the
>> > inherent challenge of providing a raw socket listener.  If the 'thing'
>> > being exchanged is not clear then we're not sure what the boundary of
>> > a given flow file should be.
>> >
>> > I'll stop rambling: if you could describe the use case a bit more,
>> > we can think about whether providing a 'datagram = flowfile' mode
>> > makes sense.
>> >
>> > Thanks!
>> > Joe
>> >
>> > On Fri, Apr 24, 2015 at 7:44 PM, Bryan Bende <bbe...@gmail.com> wrote:
>> > > Thanks for the suggestions... looks like it is in fact coming out of
>> > > ListenUDP like that. I'll try to figure out if this is expected
>> behavior,
>> > > or possibly something with how the messages are being sent.
>> > >
>> > > Sorry for the false alarm about MergeContent.
>> > >
>> > > On Fri, Apr 24, 2015 at 9:48 AM, Michael Moser <moser...@gmail.com>
>> > wrote:
>> > >
>> > >> At first glance, I would suspect ListenUDP is placing more than one
>> UDP
>> > >> datagram into one flowfile.  It might be worth spending some time
>> > checking
>> > >> if that can happen.
>> > >>
>> > >> -- Mike
>> > >>
>> > >>
>> > >> On Thu, Apr 23, 2015 at 9:35 PM, Joe Witt <joe.w...@gmail.com> wrote:
>> > >>
>> > >> > Are you sure you're not sending the [ , ] over UDP as well ;-)
>> > >> >
>> > >> > Can you create a template of your flow and send it over?  Perhaps
>> just
>> > >> > attach to a JIRA for this.  MergeContent is a powerful and useful
>> > >> > thing so if you're seeing funky behavior we want to sort it out
>> > >> > quickly.
>> > >> >
>> > >> > On Thu, Apr 23, 2015 at 8:47 PM, Bryan Bende <bbe...@gmail.com>
>> > wrote:
>> > >> > > I'm trying to use MergeContent to merge json documents. I have the
>> > >> > Header,
>> > >> > > Demarcator, and Footer properties pointing to files with [ , ]
>> > >> > > respectively. I left all other properties the same, and set Max
>> > Entries
>> > >> > to
>> > >> > > 5 and Max Bin Age to 10 seconds.
>> > >> > >
>> > >> > > I have a simple flow with ListenUDP -> MergeContent ->
>> > >> > PutSolrContentStream
>> > >> > > (from the pull request). If I send a bunch of json documents over
>> > UDP,
>> > >> > most
>> > >> > > of them will merge correctly, but I'll see a couple where the
>> > >> demarcator
>> > >> > > didn't get inserted between two json documents.
>> > >> > >
>> > >> > > Any thoughts as to why this would happen?
>> > >> > >
>> > >> > > I added a significant amount of logging to the
>> > >> getDescriptorFileContent()
>> > >> > > method in MergeContent to see if there was a reason why it would
>> > return
>> > >> > > null for the demarcator, but nothing obvious is really jumping out
>> > at
>> > >> me.
>> > >> >
>> > >>
>> >
>>
