Toivo, I started down this path, but then came up with a broader solution (which I have not tested):
1. Do a normal JSONToSQL.
2. Use MergeContent to group all of the FlowFiles from the same batch into a single new FlowFile, using the FlowFile Stream merge format.
3. Update PutSQL to support merged FlowFiles.

--Peter

From: Toivo Adams [mailto:[email protected]]
Sent: Sunday, August 28, 2016 7:27 AM
To: [email protected]
Subject: Re: Kill-and-Fill Pattern?

Hi,

Could a new processor, PutAvroSQL, help? The processor would take data in Avro format and insert all records at once.

Thanks,
toivo

2016-08-26 16:45 GMT+03:00 Peter Wicks (pwicks) <[email protected]>:

I have a source SQL table that I'm reading with a SQL SELECT statement, and I want to kill and fill a destination SQL table with this source data on an interval. My non-kill-and-fill pattern is: ExecuteSQL -> Avro To JSON -> JSON To SQL -> PutSQL.

I'm trying to come up with a good way to delete the existing data before loading the new data. One option I've considered is to mark the original Avro file with a UUID and add this attribute as a field in the destination table; then split off a FlowFile, use ReplaceText, and delete all rows whose UUID doesn't match this batch. I think this could work, but I'm worried about timing the SQL DELETE; I really want the kill and the fill steps to happen in a single transaction.

The other issue is what happens if PutSQL has to go down for a while due to database downtime and several kill-and-fill batches pile up. Is there a way I can use backpressure to make sure only a single file gets converted from JSON to SQL at a time, so that batches don't get mixed? I also considered FlowFile expiration, but is there a way to tell NiFi to expire a FlowFile only when a new FlowFile has entered the queue? For example: with one FlowFile in the queue, no expiration occurs; when a second (newer) FlowFile enters the queue, the first one expires.

Thanks,
Peter
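
For the batch-UUID approach Peter describes, the "kill" statement generated by ReplaceText could look roughly like the sketch below. The table name dest_table, the column batch_id, and the attribute reference ${batch.uuid} are all hypothetical names, assuming the batch UUID was stored as a FlowFile attribute and Expression Language is evaluated before the statement reaches PutSQL:

    -- hypothetical statement built by ReplaceText; PutSQL would execute it as-is
    DELETE FROM dest_table WHERE batch_id <> '${batch.uuid}';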
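
The single-transaction behaviour Peter is after amounts, in plain SQL terms, to something like the following sketch (same hypothetical table and column names; whether PutSQL can be made to issue both statements inside one transaction is the open question in this thread):

    BEGIN;
    -- fill: load the new batch, tagging every row with its batch UUID
    INSERT INTO dest_table (col_a, col_b, batch_id)
    VALUES (?, ?, 'current-batch-uuid');
    -- kill: remove every row that does not belong to the new batch
    DELETE FROM dest_table WHERE batch_id <> 'current-batch-uuid';
    COMMIT;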
