Mike Moser: Great thinking!

Bryan,

Taken from the ListenUDP docs: "This processor listens for Datagram Packets on a given port and concatenates the contents of those packets together generating flow files roughly as often as the internal buffer fills up or until no more data is currently available."

Quite honestly, when this processor was originally built NiFi didn't have the ability to do the sort of fancy 'slab allocation' mechanism it supports today when generating a stream of flow files. So we could probably pretty easily reimplement this to behave more like you were thinking it should. But it is probably worth a bit of discussion/exploration to see what makes sense.

The case we built it for was data arriving in UDP packets, structured in such a way that simple binary concatenation was sufficient because the data was inherently demarcatable/stream-processing friendly. We could, however, implement it now such that each UDP datagram becomes a flow file. But I'm not sure that makes sense either. This is sort of the inherent challenge of providing a raw socket listener: if the 'thing' being exchanged is not clear, then we're not sure what the boundary of a given flow file should be.

I'll stop rambling. If you would, please describe the use case a bit more so we can think about whether providing a mode of 'datagram = flowfile' makes sense.

Thanks!
Joe

On Fri, Apr 24, 2015 at 7:44 PM, Bryan Bende <bbe...@gmail.com> wrote:

> Thanks for the suggestions... looks like it is in fact coming out of
> ListenUDP like that. I'll try to figure out if this is expected behavior,
> or possibly something with how the messages are being sent.
>
> Sorry for the false alarm about MergeContent.
>
> On Fri, Apr 24, 2015 at 9:48 AM, Michael Moser <moser...@gmail.com> wrote:
>
>> At first glance, I would suspect ListenUDP is placing more than one UDP
>> datagram into one flowfile. It might be worth spending some time checking
>> if that can happen.
>>
>> -- Mike
>>
>>
>> On Thu, Apr 23, 2015 at 9:35 PM, Joe Witt <joe.w...@gmail.com> wrote:
>>
>> > Are you sure you're not sending the [ , ] over UDP as well ;-)
>> >
>> > Can you create a template of your flow and send it over? Perhaps just
>> > attach to a JIRA for this. MergeContent is a powerful and useful
>> > thing so if you're seeing funky behavior we want to sort it out
>> > quickly.
>> >
>> > On Thu, Apr 23, 2015 at 8:47 PM, Bryan Bende <bbe...@gmail.com> wrote:
>> > > I'm trying to use MergeContent to merge json documents. I have the
>> > > Header, Demarcator, and Footer properties pointing to files with [ , ]
>> > > respectively. I left all other properties the same, and set Max Entries
>> > > to 5 and Max Bin Age to 10 seconds.
>> > >
>> > > I have a simple flow with ListenUDP -> MergeContent ->
>> > > PutSolrContentStream (from the pull request). If I send a bunch of json
>> > > documents over UDP, most of them will merge correctly, but I'll see a
>> > > couple where the demarcator didn't get inserted between two json
>> > > documents.
>> > >
>> > > Any thoughts as to why this would happen?
>> > >
>> > > I added a significant amount of logging to the
>> > > getDescriptorFileContent() method in MergeContent to see if there was a
>> > > reason why it would return null for the demarcator, but nothing obvious
>> > > is really jumping out at me.
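
For anyone following the thread, the symptom can be reproduced without NiFi at all. The following is only a rough Python sketch under stated assumptions: it does not call any NiFi API, and `merge`, `is_valid_json`, and the sample payloads are hypothetical names invented for illustration. It shows why a listener that concatenates two datagrams into one flow file produces merged output with a missing demarcator, while a 'datagram = flowfile' mode would not.

```python
import json


def merge(flowfiles, header=b"[", demarcator=b",", footer=b"]"):
    """Join flow file contents with a header, demarcator, and footer.

    Hypothetical stand-in for a header/demarcator/footer merge; the
    demarcator is only placed BETWEEN flow files, never inside one.
    """
    return header + demarcator.join(flowfiles) + footer


def is_valid_json(data):
    """Return True if the bytes parse as JSON."""
    try:
        json.loads(data)
        return True
    except json.JSONDecodeError:
        return False


# Three JSON documents, each sent as its own UDP datagram (sample data).
datagrams = [b'{"id": 1}', b'{"id": 2}', b'{"id": 3}']

# Mode A: each datagram becomes its own flow file.
per_datagram = list(datagrams)

# Mode B: the listener concatenated the first two datagrams into a
# single flow file before handing them downstream.
concatenated = [datagrams[0] + datagrams[1], datagrams[2]]

merged_a = merge(per_datagram)
merged_b = merge(concatenated)

print(merged_a, is_valid_json(merged_a))
print(merged_b, is_valid_json(merged_b))
```

In mode B the merge step sees only two flow files, so it inserts a single demarcator and the first two documents end up adjacent with nothing between them, which matches what was observed downstream of ListenUDP.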