Hi,

I'm trying to replicate a number of tables from one database to another. I'd 
like the flow to take care of both the DDL ("create table if not exists ...") 
and DML ("insert") commands automatically. Ideally, the "create table" should 
be executed just once, before any insert for that same table runs. 
I can use a Distributed Map Cache to know whether the "create table" for each 
table has already been performed, but the problem is that I don't know how to 
hold the "inserts" for that table until the "create table" is done. 
I'm using "crate table if not exists as select * from ...", so I'm trying to 
create the table and populate it at the same time with the data from that first 
row. It's not a pure "create table" without data because I couldn't find any 
processor that automatically maps the avro.schema to my database's DDL. I could 
use ExecuteScript for that, and then use "create table if not exists <table 
definition>", but how to avoid running the create for every single row (or even 
for every flowfile containing many rows each)? It would be great if I could run 
the "create table" just once per table, with or without data for the first 
"batch".
It looks like a setup task, if you know what I mean. I'm not sure that fits 
how NiFi works, though. Wait and Notify don't look like the answer, either.
Probably I'd be better off treating the creation of the table structures as a 
one-time configuration task performed before the flow is first executed, but 
it would be cool to have everything automated with the same toolset, especially 
considering that new tables could be created at any time. (You may assume mine 
is a test database, so I don't actually need or want to enforce stricter 
control over what gets created. I just want it there, and maybe be notified 
when something new comes up.) 
Suggestions?
Thank you,

Marcio




