Regarding what Joe pointed out, the existing property for "Batching Message Delimiter" is the delimiter between messages when written to a flow file, aka the outbound delimiter. The delimiter when reading off the channel is hard-coded here and here:
https://github.com/apache/nifi/blob/master/nifi-commons/nifi-processor-utilities/src/main/java/org/apache/nifi/processor/util/listen/handler/socket/StandardSocketChannelHandler.java#L155
https://github.com/apache/nifi/blob/master/nifi-commons/nifi-processor-utilities/src/main/java/org/apache/nifi/processor/util/listen/handler/socket/SSLSocketChannelHandler.java#L150

I'm not really sure why it would be breaking up a 50-character line across multiple flow files... are there definitely no '\n' characters within those 50 characters?

On Wed, Jan 25, 2017 at 6:00 PM, Raymond Rogers <[email protected]> wrote:

> Bryan,
>
> It looked like I should be getting single-line or complete-line messages
> in the flow-files, which under a light load I do get. When I increase the
> message rate to what the production world would look like, I am seeing
> lines get chopped into little pieces (one 50-character line may end up in
> 3-4 flow-files).
>
> *Raymond Rogers*
> Senior Embedded Software Engineer
> 15301 N. Dallas Pkwy Suite 500
> Dallas, TX 75001
> D: +1 972 744 3928
> rmgnetworks.com <http://www.rmgnetworks.com>
>
> *From:* Bryan Bende [mailto:[email protected]]
> *Sent:* Wednesday, January 25, 2017 4:48 PM
> *To:* [email protected]
> *Subject:* Re: ListenTCP to receive CSV stream.
>
> Raymond,
>
> Currently ListenTCP uses new line characters to determine logical message
> boundaries, and coming out of the processor you can either have 1 logical
> message per flow file, or batch together a configurable number of logical
> messages into 1 flow file, which would be more performant.
>
> In your case it sounds like you would want to read data until seeing the
> "end of data" marker and treat the whole CSV as one logical message. There
> is a JIRA to add this capability:
> https://issues.apache.org/jira/browse/NIFI-1985
>
> I think the best you can do currently is to use a MergeContent processor
> somewhere after ListenTCP to merge together the individual lines from the
> CSV, but since there is no other information available to tell it how many
> total lines there are, it can't guarantee that they are all merged together
> in one flow file. You might be able to make some assumptions about the
> timing and size of the data and configure MergeContent in such a way that
> it should usually get you the whole CSV as one file.
>
> Hope this helps.
>
> -Bryan
>
> On Wed, Jan 25, 2017 at 5:18 PM, Raymond Rogers <[email protected]> wrote:
>
> I'm still new to NiFi and I'm trying to receive a text stream containing a
> CSV file of an unknown length (anything from ~100 bytes to almost 300 KB)
> over a TCP socket. The CSV does have an "end of data" marker that I can
> look for, but I am unsure of how to accumulate the text until I receive the
> marker and create a flow-file that contains all of the data up to that
> point.
>
> The data is being sent from an application that cannot be changed to use a
> different format.
>
> Any suggestions?
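
For anyone following the thread, here is a minimal Python sketch of the framing idea discussed above. It is not NiFi code and does not reproduce the linked handlers; the class name `DelimiterFramer`, its `feed` method, and the `END_OF_DATA\n` marker are all hypothetical names chosen for illustration. The assumption it illustrates is that a TCP read can return any fragment of a line, so the reader has to carry leftover bytes from one read to the next and only emit a message once the delimiter has been seen. Using the "end of data" marker as the delimiter instead of '\n' would then yield the whole CSV as one logical message.

```python
from typing import List


class DelimiterFramer:
    """Buffers raw chunks and returns complete messages split on a delimiter.

    Hypothetical helper for illustration only; not part of NiFi.
    """

    def __init__(self, delimiter: bytes = b"\n") -> None:
        if not delimiter:
            raise ValueError("delimiter must not be empty")
        self.delimiter = delimiter
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add one chunk as read from the socket; return any completed messages."""
        self._buffer.extend(chunk)
        messages = []
        while True:
            index = self._buffer.find(self.delimiter)
            if index < 0:
                # No full delimiter yet: keep the partial data for the next read.
                break
            messages.append(bytes(self._buffer[:index]))
            # Drop the emitted message and its delimiter from the buffer.
            del self._buffer[: index + len(self.delimiter)]
        return messages


if __name__ == "__main__":
    # A 50-character line that arrives in three separate reads.
    line = b"x" * 50
    framer = DelimiterFramer(b"\n")
    received = []
    for chunk in (line[:20], line[20:45], line[45:] + b"\n"):
        received.extend(framer.feed(chunk))
    print(len(received), len(received[0]))

    # Same idea with an assumed end-of-data marker instead of a newline.
    csv_framer = DelimiterFramer(b"END_OF_DATA\n")
    parts = [b"a,b\n1,2\n3,", b"4\nEND_OF", b"_DATA\n"]
    whole = []
    for chunk in parts:
        whole.extend(csv_framer.feed(chunk))
    print(whole)
```

The buffer is kept per connection, so each client socket would need its own `DelimiterFramer` instance. A real implementation would also want an upper bound on the buffer size so that a stream that never sends the marker cannot grow memory without limit.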
