Thanks for all the feedback. Looking at the source code for SplitText, I see that it parses the input FlowFile, collects the output FlowFiles in a list, and only at the end transfers the whole list with a single call to session.transfer(). This could be a problem when the input file contains millions of records.
Is there a technical reason why SplitText creates all the output FlowFiles before sending them out? If I were to write my own split processor, or a combination of GetFile and SplitText that reads the input file line by line, could I create an output FlowFile, send it out, then create the next one, send it out, and so on? Does the next processor in the flow get the FlowFile as soon as it is sent with session.transfer()?
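
To make the question concrete, here is a minimal sketch of the streaming pattern I have in mind. It is plain Java, not NiFi code: `SplitSink`, `emit`, and `flush` are hypothetical names that stand in for whatever "create an output FlowFile and transfer it" and "make the transferred FlowFiles visible downstream" would be in a real processor, and the batch size of 2 is an arbitrary assumption for the demo.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class StreamingSplit {

    /** Hypothetical stand-in for the session; NOT part of the NiFi API. */
    interface SplitSink {
        /** Hand over one record (stand-in for creating and transferring one output). */
        void emit(String record);

        /** Release everything emitted since the last flush (stand-in for a commit). */
        void flush();
    }

    /**
     * Reads the input line by line and hands each line to the sink immediately,
     * flushing after every batchSize lines so that at most one batch is pending.
     * Returns the number of lines emitted.
     */
    static long split(BufferedReader in, int batchSize, SplitSink sink) throws IOException {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        long count = 0;
        int pending = 0;
        String line;
        while ((line = in.readLine()) != null) {
            sink.emit(line);
            count++;
            pending++;
            if (pending == batchSize) {
                sink.flush();
                pending = 0;
            }
        }
        if (pending > 0) {
            sink.flush(); // release the final, partially filled batch
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Demo sink that only records the order of calls.
        List<String> log = new ArrayList<>();
        SplitSink sink = new SplitSink() {
            @Override public void emit(String record) { log.add("emit " + record); }
            @Override public void flush() { log.add("flush"); }
        };
        long n = split(new BufferedReader(new StringReader("a\nb\nc\n")), 2, sink);
        System.out.println(n + " records: " + log);
    }
}
```

The point of the sketch is that the splitter never holds more than one batch of outputs at a time, instead of a list of millions. Whether a real processor can behave this way depends on when the framework actually hands transferred FlowFiles to the next processor, which is exactly what I am unsure about.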
