I have a source SQL table that I'm reading with a SQL select statement.  I want 
to kill and fill a destination SQL table with this source data on an interval.

My current (non-kill-and-fill) pattern is: ExecuteSQL -> ConvertAvroToJSON -> 
ConvertJSONToSQL -> PutSQL.

I'm trying to come up with a good way to delete the existing data before 
loading the new data.
One option I've considered is to tag the original Avro FlowFile with a UUID 
and add that attribute as a column in the destination table; then split off 
a branch that uses ReplaceText to build a DELETE for all rows whose UUID 
doesn't match the current batch.  I think this could work, but I'm worried 
about timing the SQL DELETE.  Ideally the kill and the fill steps would 
happen in a single transaction.
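
For illustration, the end state I'm after would be something like the SQL 
below (table and column names are just placeholders, and 'new-batch-uuid' 
stands in for the attribute value written on the Avro FlowFile; exact 
transaction syntax varies by database):

    BEGIN;
    -- fill: load the rows for the new batch, tagged with its UUID
    INSERT INTO destination_table (col_a, col_b, batch_uuid)
    VALUES ('a1', 'b1', 'new-batch-uuid'),
           ('a2', 'b2', 'new-batch-uuid');
    -- kill: remove every row that doesn't belong to this batch
    DELETE FROM destination_table
    WHERE batch_uuid <> 'new-batch-uuid';
    COMMIT;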

The other issue is what happens if PutSQL stalls for a while due to database 
downtime and several kill-and-fill batches pile up.  Is there a way I can use 
backpressure to make sure only a single FlowFile gets converted from JSON to 
SQL at a time, so that batches don't get mixed together?
I also considered FlowFile expiration, but is there a way to tell NiFi to 
only expire a FlowFile once a newer FlowFile has entered the queue?  For 
example: with one FlowFile in the queue, no expiration occurs; when a second 
(newer) FlowFile enters the queue, the first one expires.

Thanks,
  Peter
