Brian,

You can use MergeContent in Defragment mode. Be sure to set MergeContent's number of bins to be equal to or greater than the number of concurrent merges you expect to have going on in your flow, and route both successfully processed and failed flowfiles (after the failures have been handled gracefully, however that suits your use case) to the MergeContent processor. If any fragment (one of the child flowfiles) is never sent to MergeContent, the defragmentation can never complete, since MergeContent will not have received all of the fragments.
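Conceptually, Defragment mode bins incoming fragments by their correlation ID and merges a bin once every expected fragment has arrived. Here's a rough sketch of that idea in Python (not actual NiFi code; the fragment.identifier, fragment.count, and fragment.index attribute names are the ones UnpackContent writes, but the data structures here are purely illustrative):

```python
def defragment(flowfiles):
    """Group fragments by fragment.identifier and reassemble each group
    once all fragment.count pieces have arrived (what MergeContent's
    Defragment mode does, sketched very loosely)."""
    bins = {}    # fragment.identifier -> {fragment.index: content}
    merged = []  # completed reassemblies
    for ff in flowfiles:
        attrs = ff["attributes"]
        frag_id = attrs["fragment.identifier"]
        bins.setdefault(frag_id, {})[int(attrs["fragment.index"])] = ff["content"]
        if len(bins[frag_id]) == int(attrs["fragment.count"]):
            # All fragments are present: join them in index order.
            parts = bins.pop(frag_id)
            merged.append(b"".join(parts[i] for i in sorted(parts)))
    return merged
```

This is also why every fragment must eventually reach MergeContent: a bin whose count is never satisfied just sits there until it expires.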
UnpackContent keeps track of the "batch" of files unpacked from the original archive by assigning each child flowfile a set of fragment attributes: an ID to correlate the fragments for merging (defragmenting, in this case), the total number of fragments, and each fragment's index. Once the merge is complete, you'll have a recreation of the original zip file, which signifies that all of the child flowfiles have finished processing.

- Jeff

On Thu, Dec 22, 2016 at 12:29 PM BD International <[email protected]> wrote:

> Hello,
>
> I've got a data flow which picks up a zip file and uses UnpackContent to
> extract the contents. The subsequent files are then converted to json and
> stored in a database.
>
> I would like to store the original zip file and only delete the file once
> all the extracted files have been stored correctly, has anyone else come
> across a way to do this?
>
> Thanks in advance,
>
> Brian
