One approach is to use another CAS Multiplier for the merge: it would take in the N chunks and produce an output CAS only once the N-th has been processed. Any later processing would then be independent of the chunking that preceded it. This merging CM could also put segments back in order if they arrive out of order, which can happen if you scale out your annotators. The CasCopier class makes it relatively easy to copy all FeatureStructures and update their offsets as necessary.
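
To illustrate the buffering and offset-shifting logic, here is a rough sketch in plain Java. It deliberately does not use the UIMA API: SegmentMerger, Segment, Span and Merged are hypothetical names made up for this example, and it assumes each chunk carries its index in the original sequence and that N is known up front. In a real merging CM the same logic would sit behind the multiplier's process/hasNext/next methods, with CasCopier doing the actual copying.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a merging CAS Multiplier (no UIMA dependency). */
final class SegmentMerger {

    /** An annotation whose offsets are relative to the text it belongs to. */
    static final class Span {
        final String type;
        final int begin;
        final int end;
        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    /** One chunk: its position in the original sequence, its text, its annotations. */
    static final class Segment {
        final int index;
        final String text;
        final List<Span> spans;
        Segment(int index, String text, List<Span> spans) {
            this.index = index;
            this.text = text;
            this.spans = spans;
        }
    }

    /** The merged output: concatenated text plus spans with shifted offsets. */
    static final class Merged {
        final String text;
        final List<Span> spans;
        Merged(String text, List<Span> spans) {
            this.text = text;
            this.spans = spans;
        }
    }

    private final int expected;
    // TreeMap keeps segments sorted by index, so arrival order does not matter.
    private final Map<Integer, Segment> pending = new TreeMap<>();

    SegmentMerger(int expected) {
        this.expected = expected;
    }

    /** Buffers a segment; returns the merged result once all N have arrived, else null. */
    Merged add(Segment segment) {
        pending.put(segment.index, segment);
        if (pending.size() < expected) {
            return null; // still waiting for more chunks
        }
        StringBuilder text = new StringBuilder();
        List<Span> spans = new ArrayList<>();
        for (Segment s : pending.values()) {
            int shift = text.length(); // where this chunk starts in the merged text
            for (Span sp : s.spans) {
                spans.add(new Span(sp.type, sp.begin + shift, sp.end + shift));
            }
            text.append(s.text);
        }
        pending.clear();
        return new Merged(text.toString(), spans);
    }
}
```

For example, with N = 2, adding chunk 1 ("world") first returns null; adding chunk 0 ("hello ") afterwards returns the merged text "hello world", and a span at 0-5 in chunk 1 ends up at 6-11 in the output.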
Burn.
