I'm having some trouble with multiple splits/merges. Here's the idea:
Big data -> Split 1 -> save all the fragment.* attributes into variables ->
Split 2 -> save all the fragment.* attributes
|
Split 1
|
Save fragment.* attributes into split1.fragment.*
|
Split 2
|
Save fragment.* attributes into split2.fragment.* attributes
|
(More processing)
|
Split 3
|
Save fragment.* attributes into split3.fragment.* attributes
|
(other stuff)
|
Restore split3.fragment.* attributes to fragment.*
|
Merge 3, using defragment strategy
|
Restore split2.fragment.* attributes to fragment.*
|
Merge 2 using defragment strategy
|
Restore split1.fragment.* attributes to fragment.*
|
Merge 1 using defragment strategy
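
To make the save/restore steps concrete, here is a minimal Python sketch of the attribute bookkeeping I have in mind. It is only a toy model on plain dicts, not NiFi code: the function names and the "split<level>." prefix scheme are my own illustration, and the attribute values are made up.

```python
# Toy model of the attribute bookkeeping only; this is not NiFi code.
# Function names and the "split<level>." prefix are illustrative assumptions.
FRAGMENT_PREFIX = "fragment."


def save_fragment_attrs(attrs, level):
    """Return a copy of attrs with fragment.* also stored as split<level>.fragment.*."""
    saved = dict(attrs)
    for key, value in attrs.items():
        if key.startswith(FRAGMENT_PREFIX):
            saved[f"split{level}.{key}"] = value
    return saved


def restore_fragment_attrs(attrs, level):
    """Return a copy of attrs with split<level>.fragment.* copied back to fragment.*."""
    prefix = f"split{level}."
    restored = dict(attrs)
    for key, value in attrs.items():
        if key.startswith(prefix + FRAGMENT_PREFIX):
            restored[key[len(prefix):]] = value
    return restored


# One flow file after Split 1 (values are made up for the example).
attrs = {"fragment.identifier": "outer-id", "fragment.index": "3", "fragment.count": "50"}
attrs = save_fragment_attrs(attrs, 1)

# Split 2 overwrites fragment.* with its own values, which are saved in turn.
attrs.update({"fragment.identifier": "inner-id", "fragment.index": "0", "fragment.count": "4"})
attrs = save_fragment_attrs(attrs, 2)

# On the way back: restore level 2 before Merge 2, then level 1 before Merge 1.
attrs = restore_fragment_attrs(attrs, 2)
attrs = restore_fragment_attrs(attrs, 1)
print(attrs["fragment.identifier"])
```

The point is that each split level gets its own namespace, so the inner split's fragment.* values never clobber the saved outer ones.
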
Am I thinking about this correctly? Sometimes NiFi is unable to merge some of
the split data (errors like "there are 50 fragments, but we only found one").
Is it possible that I need to do some prioritization in the queues? I have
noticed that things do back up and the queues fill up as the data goes through
(several of the splits need to perform REST calls and processing, which can
take time). Maybe the issue is that one fragment "slips" through before the
others have even been processed far enough. Is there an approved way to do
this?
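
To illustrate the failure I suspect, here is a toy Python model of a defragment-style merge. I am not claiming this is how NiFi implements it: the defragment function, its max_bin_age parameter, and the timings are assumptions made up for the example. It only shows how one early fragment could be reported as "1 of 50" if its bin is given up on before the slow fragments arrive.

```python
# Toy model of a defragment-style merge; NOT NiFi's actual implementation.
# The function name, max_bin_age, and all timings are illustrative assumptions.
def defragment(arrivals, max_bin_age):
    """arrivals: (time, identifier, index, count) tuples sorted by time.

    Returns (merged, failed): merged is a list of completed identifiers,
    failed is a list of (identifier, fragments_found) for abandoned bins.
    """
    bins = {}  # identifier -> (time the bin was opened, set of indexes seen)
    merged, failed = [], []
    for now, ident, index, count in arrivals:
        # Give up on any bin that has been open longer than max_bin_age.
        expired = [k for k, (start, _) in bins.items() if now - start > max_bin_age]
        for key in expired:
            failed.append((key, len(bins.pop(key)[1])))
        # Add the fragment to its bin, opening a new bin if needed.
        _, seen = bins.setdefault(ident, (now, set()))
        seen.add(index)
        if len(seen) == count:
            merged.append(ident)
            del bins[ident]
    # Bins still incomplete when the input ends are reported as failed too.
    for key, (_, seen) in bins.items():
        failed.append((key, len(seen)))
    return merged, failed


# All 50 fragments arrive close together: the bin completes.
on_time = [(i, "a", i, 50) for i in range(50)]
print(defragment(on_time, max_bin_age=60))

# Fragment 0 slips through early; the other 49 are stuck behind slow REST calls.
slipped = [(0, "a", 0, 50)] + [(100 + i, "a", i, 50) for i in range(1, 50)]
print(defragment(slipped, max_bin_age=60))
```

In the second run the early fragment's bin is abandoned with only one fragment in it, which looks a lot like the error I am seeing.
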
Thanks for the help!