Anil, Is this a "one-time" migration, meaning you would take the on-prem tables and put them on the cloud DB just once? Or would this be an incremental operation, where you do the initial move and then take any "new" rows from the source and apply them to the target? For the latter, there are a couple of options:
1) Rebuild the cloud DB periodically. You can use ExecuteSQL -> [processors] -> PutSQL after perhaps deleting your target DB/tables/etc. This could be time-consuming and expensive. The processors in-between probably include ConvertAvroToJSON and ConvertJSONToSQL. 2) Use QueryDatabaseTable or (GenerateTableFetch -> ExecuteSQL) to get the source data. For this your table would need a column whose values always increase, that column would comprise the value of the "Maximum Value Column" property in the aforementioned processors' configuration dialogs. You would need one QueryDatabaseTable or GenerateTableFetch for each table in your DB. In addition to these current solutions, as of the upcoming NiFi 1.2.0 release, you have the following options: 3) If the source database is MySQL, you can use the CaptureChangeMySQL processor to get binary log events flowing through various processors into PutDatabaseRecord to place them at the source. This pattern is true Change Data Capture (CDC) versus the other two options above. 4) Option #2 will be improved such that GenerateTableFetch will accept incoming flow files, so you can use ListDatabaseTables -> GenerateTableFetch -> ExecuteSQL to handle multiple tables with one flow. If this is a one-time migration, a data flow tool might not be the best choice, you could consider something like Flyway [1] instead. Regards, Matt [1] https://flywaydb.org/documentation/command/migrate On Tue, May 2, 2017 at 10:41 AM, Anil Rai <[email protected]> wrote: > I have a simple use case. > > DB (On Premise) and DB (On Cloud). > > I want to use nifi to extract data from on prem DB (huge volumes) and > insert into the same table structure that is hosted on cloud. > > I could use ExecuteSQL on both sides of the fence (to extract from on prem > and insert onto cloud). What processors are needed in between (if at all)? > As i am not doing any transformations at all....it is just extract and load > use case
