I’m trying to get bulk inserts working with the PutSQL processor, but it’s starting to get ugly, so I need to reach out and see if any of you have been down this path.
If you have, here’s some info. If not, thanks for reading this far ☺

Background: this is a legacy database migration ETL task. Extract from one database, do a bunch of transformations, then load it all into a PostgreSQL repo. We have hundreds of tables, obviously with many record structures, _and a ton of data_.

According to https://community.hortonworks.com/articles/91849/design-nifi-flow-for-using-putsql-processor-to-per.html, PutSQL wants the form of the SQL statement to be identical for every record in a batch insert, e.g.

    Insert into Employee ("name", "job title") VALUES (?, ?)

Easy enough to build that, but then every flow file needs attributes for all of the values and their types, e.g.

1. sql.args.1.value = Bryan B
2. sql.args.2.value = Director

plus an UpdateAttribute processor to set the matching sql.args.N.type flow file attributes:

1. sql.args.1.type = 12 (VARCHAR)
2. sql.args.2.type = 12

THIS implies my flow will need to create a couple of attributes for every single field in the dataflow, AND I’ll have to come up with logic to determine what each field’s data type is. I’m a newbie at this NiFi stuff, but that really does _not_ feel like a good path.

I’m hand-jamming a proof of concept just to validate the above, but I’m having a hard time lining up the data types (e.g. the database has a char(2) field; I’ve tried char, I’ve tried varchar, ... type-code sketch below). The other SQL “insert-able” processors seem to want to read a file instead of a flow file, but I could easily be missing something.

Suggestions would be appreciated!
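For what it’s worth, the sql.args.N.type numbers appear to be the integer constants from java.sql.Types (the linked article’s 12 is VARCHAR; CHAR is 1, which is presumably what that char(2) column wants). Here is the throwaway sketch I used to list the codes by name; the class name is just mine:

    import java.lang.reflect.Field;
    import java.lang.reflect.Modifier;
    import java.sql.Types;

    // Throwaway helper: dump the java.sql.Types constants so the numeric codes
    // used in sql.args.N.type can be looked up by name instead of guessed.
    public class JdbcTypeCodes {
        public static void main(String[] args) throws Exception {
            for (Field f : Types.class.getFields()) {
                if (Modifier.isStatic(f.getModifiers()) && f.getType() == int.class) {
                    System.out.printf("%-20s = %d%n", f.getName(), f.getInt(null));
                }
            }
        }
    }

Running it prints CHAR = 1, VARCHAR = 12, DATE = 91, TIMESTAMP = 93, and so on, which at least explains why my guessing between char and varchar wasn’t converging.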

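For completeness, the direction I’ve been poking at for the “figure out the data type” part is to read the JDBC type codes straight out of the target schema, since PostgreSQL already knows them. This is only a sketch: the connection URL, credentials, and the employee table name are placeholders for my own setup, and it assumes the PostgreSQL JDBC driver is on the classpath.

    import java.sql.Connection;
    import java.sql.DatabaseMetaData;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.util.LinkedHashMap;
    import java.util.Map;

    // Sketch: pull the JDBC type code for every column of a table from the
    // target database, so the sql.args.N.type values don't have to be
    // hand-matched per field. Connection details and table name are placeholders.
    public class ColumnTypeLookup {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/repo", "etl_user", "secret")) {
                DatabaseMetaData md = conn.getMetaData();
                Map<String, Integer> types = new LinkedHashMap<>();
                try (ResultSet cols = md.getColumns(null, "public", "employee", "%")) {
                    while (cols.next()) {
                        // DATA_TYPE is the java.sql.Types code that goes into sql.args.N.type
                        types.put(cols.getString("COLUMN_NAME"), cols.getInt("DATA_TYPE"));
                    }
                }
                types.forEach((name, code) -> System.out.println(name + " -> " + code));
            }
        }
    }

That takes the guesswork out of the type codes, but it still leaves me generating a pair of attributes per field for every table, which is the part that doesn’t feel right, hence the question.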