Wanted to add some additional details:
1) The tables (Customer and driving license) are in two different databases.
2) The solution works for a small number of records (~100 or so). The deadlock issue occurs when we test with 1M records.
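
For reference, here is a minimal sketch of Step 3 and Step 4 from the approach quoted below. It is only an illustration, not our actual NiFi flow: it uses Python's sqlite3 with two in-memory databases standing in for the two MSSQL databases, and the table and column names (customer, tgt_driving_license, ssn) and the helper names batched and backfill_customer_ids are assumptions made for the example. It applies the updates from a single writer in small batches, with one commit per batch, so that each transaction stays short. Whether this removes the deadlock on MSSQL is untested and depends on the actual cause.

```python
import sqlite3
from itertools import islice


def batched(rows, size):
    """Yield successive lists of at most `size` items from `rows`."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def backfill_customer_ids(customer_db, license_db, batch_size=1000):
    """Copy customer_id onto license rows by SSN, one short transaction per batch."""
    # Step 3: read (customer_id, ssn) pairs from the customer database.
    pairs = customer_db.execute(
        "SELECT customer_id, ssn FROM customer ORDER BY ssn"
    )
    updated = 0
    # Step 4: apply the updates from a single writer, committing per batch.
    for chunk in batched(pairs, batch_size):
        cur = license_db.executemany(
            "UPDATE tgt_driving_license SET customer_id = ? WHERE ssn = ?",
            chunk,
        )
        license_db.commit()
        updated += cur.rowcount
    return updated


# Two separate in-memory databases stand in for the two MSSQL databases.
customer_db = sqlite3.connect(":memory:")
license_db = sqlite3.connect(":memory:")
customer_db.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY AUTOINCREMENT, ssn TEXT UNIQUE)"
)
license_db.execute(
    "CREATE TABLE tgt_driving_license ("
    "driving_license_id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "ssn TEXT, customer_id INTEGER)"
)

# Made-up sample data: 2500 customers, one license row per SSN.
ssns = [f"ssn-{i:04d}" for i in range(2500)]
customer_db.executemany(
    "INSERT INTO customer (ssn) VALUES (?)", [(s,) for s in ssns]
)
license_db.executemany(
    "INSERT INTO tgt_driving_license (ssn) VALUES (?)",
    [(s,) for s in reversed(ssns)],
)
customer_db.commit()
license_db.commit()

updated = backfill_customer_ids(customer_db, license_db, batch_size=1000)
missing = license_db.execute(
    "SELECT COUNT(*) FROM tgt_driving_license WHERE customer_id IS NULL"
).fetchone()[0]
print(updated, missing)
```

The batch size of 1000 is arbitrary. On the real target, an index on the SSN column of TGT_DRIVING_LICENSE would presumably matter more than the batch size, since each update looks up rows by SSN.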

On Thu, Mar 19, 2020 at 1:07 PM Samarendra Sahoo <[email protected]> wrote:

> Thanks a lot, Emanuel. Below are the details of what we have done and the
> deadlock issue we are facing.
>
> *Problem Statement & Requirement:*
>
> Our use case is to migrate data from MSSQL to MSSQL using NiFi. As part of
> that, we need to load the Customer table (with customer_id being a surrogate
> key/auto-increment ID that gets generated while we load data) and the
> DRIVING_LICENSE table. While loading the DRIVING_LICENSE table, we need to
> populate customer_id based on the SSN present in the DRIVING_LICENSE table.
>
> *Approach*
>
> Step 1 - Load the Customer table with surrogate key (customer_id) to target.
>
> Step 2 - Load the TGT_DRIVING_LICENSE table with surrogate key
> (driving_license_id) to target.
>
> Step 3 - Select SSN and customer_id from the target Customer table and hold
> them in memory. There is no physical staging table here.
>
> Step 4 - Update customer_id in the target TGT_DRIVING_LICENSE table, based on
> the SSN column.
>
> Attached are all the NiFi processing steps and the error we are getting.
>
> Best regards
>
> Sam
>
> On Sat, Mar 14, 2020 at 9:42 PM Emanuel Oliveira <[email protected]>
> wrote:
>
>> Hi,
>>
>> Please share the processors you are using, as it's not clear how you are
>> implementing what you describe in English:
>> - load two tables? What does that mean? What processor are you using to
>> create/prepare the data, and what processor are you using to "load"?
>> - etc.
>>
>> Once we understand better, I can help you better, but for now it's simply
>> too vague.
>>
>> Best Regards,
>> *Emanuel Oliveira*
>>
>> On Sat, Mar 14, 2020 at 8:27 AM Samarendra Sahoo <
>> [email protected]> wrote:
>>
>>> Hello,
>>> We have a use case where we have to load two tables, say Customer (here
>>> customer ID is a sequence and gets generated while we load data) and
>>> purchase_order. While loading purchase_order, we need to populate
>>> customer_id based on the SSN present in the purchase_order table. Since
>>> there is this dependency, we are trying to create this in one process
>>> group with step 1 - load customer, step 2 - load purchase order with a
>>> dummy customer_id, step 3 - join purchase_order and customer based on SSN
>>> and populate customer_id in purchase_order.
>>>
>>> While doing so, multiple flow files are generated for the customer
>>> table, as we are loading this data based on partition. We would like to
>>> know how to trigger the next processor only once, when all flow files
>>> have been processed by the previous processor.
>>>
>>> Looking for help, or are there any better approaches to achieve this?
>>>
>>> Thanks
>>> Sam
