I’m trying to get bulk inserts working with the PutSQL processor, but it’s starting to get ugly, so I need to reach out and see if any of you have been down this path.
If you have, here’s some info. If not, thanks for reading this far ☺

Background: this is a legacy database migration ETL task. Extract from one database, do a bunch of transformations, then load it all into a PostgreSQL repo. We have hundreds of tables, obviously with many record structures, _and a ton of data_.

According to https://community.hortonworks.com/articles/91849/design-nifi-flow-for-using-putsql-processor-to-per.html, PutSQL wants the form of the SQL statement to be identical for every record in a batch insert, e.g.

    Insert into Employee ("name", "job title") VALUES (?, ?)

Easy enough to build that, but then every flow file needs attributes for all of the values and their types, e.g.

1. sql.args.1.value = Bryan B
2. sql.args.2.value = Director

plus an UpdateAttribute processor to set the matching sql.args.N.type flow file attributes:

1. sql.args.1.type = 12 (VARCHAR)
2. sql.args.2.type = 12

THIS implies my flow will need to create a couple of attributes for every single field in the dataflow, AND I’ll have to come up with logic to determine what each field’s data type is. I’m a newbie at this NiFi stuff, but that really does _not_ feel like a good path.

I’m hand-jamming a proof of concept just to validate the above, but I’m having a hard time lining up the data types (e.g. the database has a char(2) field; I’ve tried char, I’ve tried varchar, ... type-code sketch below). The other SQL “insert-able” processors seem to want to read a file instead of a flow file, but I could easily be missing something.

Suggestions would be appreciated!
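For what it’s worth, the sql.args.N.type numbers appear to be the integer constants from java.sql.Types (the linked article’s 12 is VARCHAR; CHAR is 1, which is presumably what that char(2) column wants). Here is the throwaway sketch I used to list the codes by name; the class name is just mine:

    import java.lang.reflect.Field;
    import java.lang.reflect.Modifier;
    import java.sql.Types;

    // Throwaway helper: dump the java.sql.Types constants so the numeric codes
    // used in sql.args.N.type can be looked up by name instead of guessed.
    public class JdbcTypeCodes {
        public static void main(String[] args) throws Exception {
            for (Field f : Types.class.getFields()) {
                if (Modifier.isStatic(f.getModifiers()) && f.getType() == int.class) {
                    System.out.printf("%-20s = %d%n", f.getName(), f.getInt(null));
                }
            }
        }
    }

Running it prints CHAR = 1, VARCHAR = 12, DATE = 91, TIMESTAMP = 93, and so on, which at least explains why my guessing between char and varchar wasn’t converging.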

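For completeness, the direction I’ve been poking at for the “figure out the data type” part is to read the JDBC type codes straight out of the target schema, since PostgreSQL already knows them. This is only a sketch: the connection URL, credentials, and the employee table name are placeholders for my own setup, and it assumes the PostgreSQL JDBC driver is on the classpath.

    import java.sql.Connection;
    import java.sql.DatabaseMetaData;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.util.LinkedHashMap;
    import java.util.Map;

    // Sketch: pull the JDBC type code for every column of a table from the
    // target database, so the sql.args.N.type values don't have to be
    // hand-matched per field. Connection details and table name are placeholders.
    public class ColumnTypeLookup {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/repo", "etl_user", "secret")) {
                DatabaseMetaData md = conn.getMetaData();
                Map<String, Integer> types = new LinkedHashMap<>();
                try (ResultSet cols = md.getColumns(null, "public", "employee", "%")) {
                    while (cols.next()) {
                        // DATA_TYPE is the java.sql.Types code that goes into sql.args.N.type
                        types.put(cols.getString("COLUMN_NAME"), cols.getInt("DATA_TYPE"));
                    }
                }
                types.forEach((name, code) -> System.out.println(name + " -> " + code));
            }
        }
    }

That takes the guesswork out of the type codes, but it still leaves me generating a pair of attributes per field for every table, which is the part that doesn’t feel right, hence the question.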