Toivo,

I started down this path, but then came up with a broader solution (which I 
have not tested):


1. Do a normal JSONToSQL.

2. Use MergeContent to group all of the FlowFiles from the same batch into a
single new FlowFile using the FlowFile Stream merge format.

3. Update PutSQL to support merged FlowFiles.
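
The idea in step 3 could be sketched like this (Python with sqlite3, purely
illustrative since PutSQL is a Java processor; the table and statements are
made up): every statement unpacked from one merged FlowFile is executed in a
single transaction, so the batch lands or fails as a unit.

```python
# Illustrative sketch: executing all statements from one merged batch
# atomically, the way a merged-FlowFile-aware PutSQL might.
import sqlite3

def put_merged_batch(conn, statements):
    """Execute every (sql, params) pair from one merged batch in one transaction."""
    try:
        cur = conn.cursor()
        for sql, params in statements:
            cur.execute(sql, params)
        conn.commit()          # all rows become visible together
    except Exception:
        conn.rollback()        # a failure leaves the destination untouched
        raise

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dest (id INTEGER, name TEXT)")
batch = [("INSERT INTO dest VALUES (?, ?)", (1, "a")),
         ("INSERT INTO dest VALUES (?, ?)", (2, "b"))]
put_merged_batch(conn, batch)
```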

--Peter

From: Toivo Adams [mailto:[email protected]]
Sent: Sunday, August 28, 2016 7:27 AM
To: [email protected]
Subject: Re: Kill-and-Fill Pattern?

Hi,

Could a new processor, PutAvroSQL, help? The processor would take data in
Avro format and insert all of the records at once.

Thanks,
toivo

2016-08-26 16:45 GMT+03:00 Peter Wicks (pwicks) <[email protected]>:
I have a source SQL table that I’m reading with a SQL select statement.  I want 
to kill and fill a destination SQL table with this source data on an interval.

My non kill-and-fill pattern is: ExecuteSQL -> Avro To JSON -> JSON To SQL -> 
PutSQL.

I’m trying to come up with a good way to delete existing data first before 
loading new data.
One option I’ve considered is to mark the original Avro file with a UUID and 
add this attribute as a field in the destination table; then do a split off, 
ReplaceText, and delete all rows where the UUID doesn’t match this batch.  I 
think this could work, but I’m worried about timing the SQL DELETE.  I kind of 
want the kill and the fill steps to happen in a single transaction.
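For what it's worth, the single-transaction behavior described above looks
like this in plain SQL, sketched here with Python's sqlite3 (the table and
column names are invented for the example): the DELETE and the INSERTs sit
inside one transaction, so a reader never observes an empty or half-filled
table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dest (id INTEGER, batch_uuid TEXT)")
conn.execute("INSERT INTO dest VALUES (1, 'old-batch')")
conn.commit()

new_rows = [(10, "new-batch"), (11, "new-batch")]
try:
    # Kill and fill inside one transaction: either both steps commit
    # or neither does.
    conn.execute("DELETE FROM dest")
    conn.executemany("INSERT INTO dest VALUES (?, ?)", new_rows)
    conn.commit()
except Exception:
    conn.rollback()
    raise
```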

The other issue is what happens if PutSQL has to go down for a while due to 
database downtime and I get several kill-and-fill batches piled up.  Is there a 
way I can use backpressure to make sure only a single file gets converted from 
JSON to SQL at a time in order to avoid mixing batches?
I also considered FlowFile expiration, but is there a way I can tell NiFi to 
only expire a FlowFile when a new FlowFile has entered the queue? For example: 
with one FlowFile in the queue, no expiration occurs; when a second (newer) 
FlowFile enters the queue, the first one expires.
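
That expire-on-newer behavior amounts to a depth-one queue where a new
arrival evicts whatever is already waiting. A minimal single-producer sketch
in Python (the class name is made up; this is not a NiFi API):

```python
from queue import Queue, Empty

class LatestOnlyQueue:
    """Holds at most one item; a newer arrival evicts the older one.
    Single-producer sketch: the drop-then-put pair is not atomic."""
    def __init__(self):
        self._q = Queue(maxsize=1)

    def put(self, item):
        try:
            self._q.get_nowait()     # expire the stale batch, if any
        except Empty:
            pass
        self._q.put_nowait(item)

    def get_nowait(self):
        return self._q.get_nowait()

q = LatestOnlyQueue()
q.put("batch-1")
q.put("batch-2")   # batch-1 expires because a newer batch arrived
```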

Thanks,
  Peter
