I wanted to add some additional details:
1) The tables (customer and driving license) are in two different databases.
2) The solution works for a small number of records (~100 or so). The
deadlock issue occurs when we test with 1M records.
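
To make the discussion concrete, here is a minimal Python sketch of Steps 3 and 4 from the approach quoted below. It is not the NiFi flow itself: sqlite3 stands in for the two MSSQL databases so that it runs anywhere, the lowercased table and column names are illustrative, and the helper names (load_ssn_map, update_customer_ids, demo) and the batch_size parameter are hypothetical. Committing small batches in a fixed SSN order is only an assumption about what might ease lock contention; it has not been verified against the 1M-record case.

```python
import sqlite3


def load_ssn_map(customer_db):
    """Step 3: read (ssn, customer_id) pairs from the target Customer table into memory."""
    rows = customer_db.execute("SELECT ssn, customer_id FROM customer")
    return dict(rows)


def update_customer_ids(license_db, ssn_map, batch_size=1000):
    """Step 4: set customer_id in tgt_driving_license by SSN, one transaction per batch."""
    # Assumption: a single writer, a fixed SSN order and short transactions
    # keep lock scopes small. This is a sketch, not a verified deadlock fix.
    pairs = sorted(ssn_map.items())  # (ssn, customer_id), ordered by ssn
    updated = 0
    for start in range(0, len(pairs), batch_size):
        batch = [(cid, ssn) for ssn, cid in pairs[start:start + batch_size]]
        with license_db:  # commits the batch, or rolls it back on error
            cur = license_db.executemany(
                "UPDATE tgt_driving_license SET customer_id = ? WHERE ssn = ?",
                batch,
            )
            updated += cur.rowcount
    return updated


def demo():
    # Two separate in-memory databases mirror the two-database setup.
    customer_db = sqlite3.connect(":memory:")
    license_db = sqlite3.connect(":memory:")
    customer_db.execute(
        "CREATE TABLE customer ("
        "customer_id INTEGER PRIMARY KEY AUTOINCREMENT, ssn TEXT UNIQUE)"
    )
    license_db.execute(
        "CREATE TABLE tgt_driving_license ("
        "driving_license_id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "ssn TEXT, customer_id INTEGER)"
    )
    with customer_db:  # Step 1: load customers, surrogate keys are generated
        customer_db.executemany(
            "INSERT INTO customer (ssn) VALUES (?)",
            [("111",), ("222",), ("333",)],
        )
    with license_db:  # Step 2: load licenses without customer_id
        license_db.executemany(
            "INSERT INTO tgt_driving_license (ssn) VALUES (?)",
            [("333",), ("111",)],
        )
    ssn_map = load_ssn_map(customer_db)
    updated = update_customer_ids(license_db, ssn_map, batch_size=2)
    rows = license_db.execute(
        "SELECT ssn, customer_id FROM tgt_driving_license ORDER BY ssn"
    ).fetchall()
    return updated, rows


if __name__ == "__main__":
    print(demo())
```

The lookup map is built once and the updates go through a single connection, so no two writers compete for the same rows in this sketch.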

On Thu, Mar 19, 2020 at 1:07 PM Samarendra Sahoo <[email protected]>
wrote:

> Thanks a lot, Emanuel. Below are the details of what we have done and the
> deadlock issue we are facing.
>
> *Problem Statement & Requirement:*
>
> Our use case is to migrate data from MSSQL to MSSQL using NiFi. As part of
> that, we need to load the Customer table (where customer_id is a surrogate
> key/auto-increment ID that is generated while we load the data) and the
> DRIVING_LICENSE table. While loading the DRIVING_LICENSE table, we need to
> populate customer_id based on the SSN present in the DRIVING_LICENSE table.
>
>
>
> *Approach*
>
> Step 1 - Load the Customer table with its surrogate key (customer_id) into
> the target.
>
> Step 2 - Load the TGT_DRIVING_LICENSE table with its surrogate key
> (driving_license_id) into the target.
>
> Step 3 - Select SSN and customer_id from the target Customer table and hold
> them in memory. There is no physical staging table here.
>
> Step 4 - Update customer_id in the target TGT_DRIVING_LICENSE table, based on
> the SSN column.
>
>
> I have attached all the NiFi processing steps and the error we are getting.
>
>
> Best regards
>
> Sam
>
>
>
> On Sat, Mar 14, 2020 at 9:42 PM Emanuel Oliveira <[email protected]>
> wrote:
>
>> Hi,
>>
>> Please share the processors you are using, as it's not clear how you are
>> implementing what you describe in English:
>> - Load two tables? What does that mean? Which processor are you using to
>> create/prepare the data, and which processor are you using to "load" it?
>> - etc.
>>
>> Once we understand this better, I can help you better, but right now it's
>> simply too vague.
>>
>> Best Regards,
>> *Emanuel Oliveira*
>>
>>
>>
>> On Sat, Mar 14, 2020 at 8:27 AM Samarendra Sahoo <
>> [email protected]> wrote:
>>
>>>
>>> Hello,
>>> We have a use case where we have to load two tables, say Customer (here
>>> customer ID is a sequence and gets generated while we load data) and
>>> purchase_order. While loading purchase_order, we need to populate
>>> customer_id based on the SSN present in the purchase_order table. Since
>>> there is this dependency, we are trying to build this in one process group:
>>> step 1 - load customer; step 2 - load purchase order with a dummy
>>> customer_id; step 3 - join purchase_order and customer based on ssn and
>>> populate customer_id in purchase_order.
>>>
>>> While doing so, multiple flow files are generated for the customer
>>> table, as we are loading this data by partition. We would like to know
>>> how to trigger the next processor only once, when all flow files have been
>>> processed by the previous processor.
>>>
>>> We are looking for help, or any better approaches to achieve this.
>>>
>>> Thanks
>>> Sam
>>>
>>>
>>>
