Bryan, it looked like I should be getting a single line, or complete-line messages, in the flow-files, which under a light load I do get. When I increase the message rate to what production would look like, I see lines get chopped into little pieces (one 50-character line may end up in 3-4 flow-files).
Raymond Rogers
Senior Embedded Software Engineer
15301 N. Dallas Pkwy Suite 500 Dallas, TX 75001
D: +1 972 744 3928
rmgnetworks.com<http://www.rmgnetworks.com>

From: Bryan Bende [mailto:[email protected]]
Sent: Wednesday, January 25, 2017 4:48 PM
To: [email protected]
Subject: Re: ListenTCP to receive CSV stream.

Raymond,

Currently ListenTCP uses newline characters to determine logical message boundaries. Coming out of the processor you can either have 1 logical message per flow file, or batch together a configurable number of logical messages into 1 flow file, which would be more performant.

In your case it sounds like you would want to read data until seeing the "end of data" marker and treat the whole CSV as one logical message. There is a JIRA to add this capability: https://issues.apache.org/jira/browse/NIFI-1985

I think the best you can do currently is to use a MergeContent processor somewhere after ListenTCP to merge together the individual lines from the CSV. But since there is no other information available to tell it how many total lines there are, it can't guarantee that they are all merged together in one flow file. You might be able to make some assumptions about the timing and size of the data and configure MergeContent in such a way that it should usually get you the whole CSV as one file.

Hope this helps.

-Bryan

On Wed, Jan 25, 2017 at 5:18 PM, Raymond Rogers <[email protected]<mailto:[email protected]>> wrote:

I'm still new to NiFi and I'm trying to receive a text stream containing a CSV file of an unknown length (anything from ~100 bytes to almost 300 KB) over a TCP socket. The CSV does have an "end of data" marker that I can look for, but I am unsure how to accumulate the text until I receive the marker and create a flow-file that contains all of the data up to that point.

The data is being sent from an application that cannot be changed to use a different format.

Any suggestions?
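For anyone following the thread, the "accumulate until the end-of-data marker" idea can be sketched outside NiFi as below. This is only a minimal Python illustration of marker-based framing over a chunked TCP stream. It is not NiFi code and not the NIFI-1985 implementation. The class name MarkerFramer and the marker value END_OF_DATA are hypothetical, since the thread does not say what the real marker looks like.

```python
from typing import List


class MarkerFramer:
    """Buffer raw TCP chunks and emit one message per end-of-data marker.

    The default marker is a placeholder (assumption); substitute whatever
    the sending application actually emits.
    """

    def __init__(self, marker: bytes = b"END_OF_DATA\n") -> None:
        if not marker:
            raise ValueError("marker must not be empty")
        self._marker = marker
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Append one chunk and return every message completed so far.

        A chunk may hold part of a line, several lines, or a marker split
        across two reads, so the search always runs on the whole buffer.
        """
        self._buffer.extend(chunk)
        messages: List[bytes] = []
        while True:
            idx = self._buffer.find(self._marker)
            if idx < 0:
                break  # no complete message yet; keep accumulating
            messages.append(bytes(self._buffer[:idx]))
            # Drop the message and its marker from the buffer.
            del self._buffer[: idx + len(self._marker)]
        return messages

    @property
    def pending(self) -> bytes:
        """Bytes received so far that are not yet part of a complete message."""
        return bytes(self._buffer)


# Example: the marker arrives split across two reads.
framer = MarkerFramer()
complete: List[bytes] = []
for chunk in (b"a,b\n1,", b"2\nEND_OF", b"_DATA\nc,d\n"):
    complete.extend(framer.feed(chunk))
```

In a real receiver the chunks would come from repeated socket.recv() calls. The point of the sketch is that message boundaries are decided by the marker rather than by how the bytes happen to be split across reads.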
