Tim,

Based on what you describe, and not being familiar with Kafka or your application, it sounds like breaking each row into its own FlowFile could make sense, depending on what you need to do downstream. There is overhead associated with each FlowFile, and there is also a provenance consideration: what level of granularity do you want to track for the flow? If there's a logical way to group the JSON objects for multiple rows together in one FlowFile, that may be more efficient.
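
To make the grouping idea concrete, here is a minimal sketch in plain Java with no NiFi dependency. It streams the input one line at a time, turns each row into a JSON object, and hands off newline-delimited groups of up to `batchSize` objects through a callback. The class name `RowBatcher`, the method names, and the one-field `{"row": ...}` JSON shape are all made up for illustration; your real row parsing would replace `toJson`, and in a processor the callback is where an output FlowFile would be created and transferred.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Consumer;

// Hypothetical helper, not part of NiFi: groups parsed rows into batches.
final class RowBatcher {

    // Placeholder row-to-JSON conversion (an assumption): wraps the raw row
    // text in a single "row" field, escaping quotes, backslashes, and
    // control characters. Real parsing logic would go here.
    public static String toJson(String row) {
        StringBuilder sb = new StringBuilder("{\"row\":\"");
        for (char c : row.toCharArray()) {
            if (c == '"') {
                sb.append("\\\"");
            } else if (c == '\\') {
                sb.append("\\\\");
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append("\"}").toString();
    }

    // Reads the input line by line, so the whole file is never held in
    // memory, and passes each group of up to batchSize JSON objects
    // (one per line) to the sink as soon as the group is full.
    public static void forEachBatch(Reader in, int batchSize, Consumer<String> sink)
            throws IOException {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        BufferedReader reader = new BufferedReader(in);
        StringBuilder current = new StringBuilder();
        int count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.isEmpty()) {
                continue; // skip blank rows
            }
            if (count > 0) {
                current.append('\n');
            }
            current.append(toJson(line));
            count++;
            if (count == batchSize) {
                sink.accept(current.toString());
                current.setLength(0);
                count = 0;
            }
        }
        if (count > 0) {
            sink.accept(current.toString()); // final partial group
        }
    }

    public static void main(String[] args) throws IOException {
        // Three sample rows grouped two at a time.
        forEachBatch(new StringReader("a,1\nb,2\nc,3\n"), 2,
                payload -> System.out.println(payload + "\n--"));
    }
}
```

With `batchSize` set to 1 this degenerates to one JSON object per output, which matches the one-FlowFile-per-row approach; larger values trade per-FlowFile overhead against coarser provenance.
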
For throughput reasons, if you have a huge number of rows being converted to separate FlowFiles, you may want to consider "batching" FlowFile creation within your processor (look at how GetFile does this, for example). That way, each time your processor's onTrigger method is called, it can quickly process and emit a batch of N JSON objects and then relinquish control.

You said the incoming text file is "very large"; I'm not sure whether that means MB, GB, or TB. Keep in mind that it will have to be read entirely into the content repository by GetFile before processing, and then your processor will have to stream that huge file in line by line, parsing each row and creating the JSON objects. I'm not sure whether you can accomplish this using the standard NiFi building blocks and expression language, but it might be possible.

Hope that helps.

Rick

-----Original Message-----
From: timF [mailto:[email protected]]
Sent: Saturday, September 12, 2015 1:50 AM
To: [email protected]
Subject: custom processor - parse flowFile to many kafka messages

I need to create a custom processor.

GetFile --> MyProcessor --> PutKafka

The incoming flowFile will be a very large text file. Each row of the file will need to be parsed, put into its own JSON object, and then sent to a Kafka topic.

My question is the following: Do I need to write each JSON object to its own output flowFile? That is, if the input file contains N rows, and I want N messages to show up in the Kafka topic, do I create N output flowFiles?

--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/custom-processor-parse-flowFile-to-many-kafka-messages-tp2782.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.
