[
https://issues.apache.org/jira/browse/NIFI-378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14571961#comment-14571961
]
Joseph Witt commented on NIFI-378:
----------------------------------
Ideally, I'd like to leave the decision of how to disposition the ticket to
you as the reporter.
I just want to make sure that I understand your scenario, and I'm not quite
there yet. Perhaps the terminology is tricky, as we seem to be mixing
implementation and use case constructs a bit. From the processor's point of
view there is really only a single queue, where a queue is 'a place from which
flowfiles are available'. It is true that there is no known ordering of that
queue from the perspective of the processor itself. But in the case of
defragment mode that is ok, because it uses the index of flowfiles as found
within a given identifier as its mechanism for establishing order after the
flowfiles are pulled from the queue.
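
To make the ordering point concrete, here is a minimal sketch (not the actual
MergeContent code). It assumes FlowFiles can be modelled as plain dicts that
carry the fragment.* attributes; the helper name order_fragments and the
"content" key are hypothetical. Fragments arrive in arbitrary order and are
put back in sequence by fragment.index before concatenation.

```python
import random

# Hypothetical stand-ins for FlowFiles: dicts holding the fragment.*
# attributes plus an illustrative "content" payload.
def order_fragments(flowfiles):
    """Return the fragments of one identifier sorted by fragment.index."""
    return sorted(flowfiles, key=lambda ff: int(ff["fragment.index"]))

fragments = [
    {"fragment.identifier": "foo", "fragment.index": "1",
     "fragment.count": "3", "content": "A"},
    {"fragment.identifier": "foo", "fragment.index": "2",
     "fragment.count": "3", "content": "B"},
    {"fragment.identifier": "foo", "fragment.index": "3",
     "fragment.count": "3", "content": "C"},
]

# The queue gives no ordering guarantee, so simulate an arbitrary order.
random.shuffle(fragments)

# Order is re-established from the index after the pull, not from the queue.
merged = "".join(ff["content"] for ff in order_fragments(fragments))
```

This only works when each index appears once per identifier, which is the
contract I am referring to below.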
What I believe you've described is a scenario where multiple flowfiles that
share the same identifier and index can be in the processor's queue at the
same time. That sounds to me like a violation of the contract of this
processor. I'd like to make sure I have that understanding right and then keep
peeling this onion back to get to the root issue, to understand what could or
should be done in MergeContent's defragment mode to help this case, or to see
whether we can provide documentation for your specific use case to help a
dataflow manager set the flow up to handle that scenario.
Thanks
Joe
> MergeContent in Defragment mode will merge fragments without checking index
> ---------------------------------------------------------------------------
>
> Key: NIFI-378
> URL: https://issues.apache.org/jira/browse/NIFI-378
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Affects Versions: 0.0.1
> Reporter: Michael Moser
> Priority: Minor
>
> When in Defragment mode, the MergeContent processor looks for
> fragment.identifier and fragment.count attributes in order to place FlowFiles
> in the correct bin. The fragment.index attribute is ignored.
> If you happen to have many FlowFiles in the queue to MergeContent, and they
> all have fragment.identifier=foo and fragment.count=2, then it will merge two
> FlowFiles that have fragment.index=1 or it will merge two FlowFiles that have
> fragment.index=2.
> Granted this may seem odd. The use case is to give the MergeContent
> processor two input queues. We configure one queue to contain files with
> fragment.index=1 and the other queue to contain files with fragment.index=2.
> We want one file from each queue to be merged. Instead it will merge two
> files from the same queue.
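
The behaviour reported in the description above can be illustrated with a
small simulation. This is only a sketch under stated assumptions, not the
processor's real binning logic: FlowFiles are modelled as dicts, and the
function names bin_ignoring_index, has_duplicate_index and make_flowfile are
hypothetical. Bins are keyed on fragment.identifier and fragment.count alone,
and a bin is emitted as soon as it holds fragment.count entries.

```python
from collections import defaultdict

def bin_ignoring_index(queue):
    """Group by (identifier, count) only; emit a bin once it is full."""
    bins = defaultdict(list)
    bundles = []
    for ff in queue:
        key = (ff["fragment.identifier"], ff["fragment.count"])
        bins[key].append(ff)
        # fragment.index is never consulted when deciding the bin is full.
        if len(bins[key]) == int(ff["fragment.count"]):
            bundles.append(bins.pop(key))
    return bundles

def has_duplicate_index(bundle):
    """True if two fragments in the bundle share a fragment.index."""
    indexes = [ff["fragment.index"] for ff in bundle]
    return len(indexes) != len(set(indexes))

def make_flowfile(index):
    # Hypothetical FlowFile with the attribute values from the ticket.
    return {"fragment.identifier": "foo",
            "fragment.count": "2",
            "fragment.index": str(index)}

# Two files with index 1 happen to be pulled before the files with index 2.
queue = [make_flowfile(1), make_flowfile(1),
         make_flowfile(2), make_flowfile(2)]
bundles = bin_ignoring_index(queue)
```

With this arrival order both emitted bundles contain a repeated index, which
is the symptom in the report. A check along the lines of has_duplicate_index
is one way such a bundle could be detected before merging.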
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)