It is probably a good idea to use a script in a scripting processor to
extract the details needed about the splits, then feed those results into
the merge attributes as you suggested. That would be the safest and
cleanest approach.
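
For what it's worth, here is a minimal sketch of the parsing step. It is plain Python, not NiFi's scripting API: the function name is hypothetical and a dict stands in for the FlowFile's attribute map, so it would still need to be wired into whatever script processor you use. It only derives fragment.identifier and fragment.count; fragment.index would have to be set separately.

```python
import json


def fragment_attributes(attributes):
    """Derive fragment.identifier and fragment.count for one FlowFile.

    `attributes` is a plain dict standing in for the FlowFile's attribute
    map (an assumption for illustration, not a NiFi type).
    """
    # Parse fileArray as real JSON so a comma inside a filename is harmless.
    files = json.loads(attributes["fileArray"])
    if not isinstance(files, list):
        raise ValueError("fileArray is not a JSON array")
    return {
        "fragment.identifier": attributes["groupId"],
        "fragment.count": str(len(files)),  # attribute values are strings
    }


# A filename containing a comma, the case where splitting on ',' miscounts.
example = {
    "groupId": "42",
    "fileArray": '["a,b.txt","file2.txt","file3.txt"]',
}
result = fragment_attributes(example)
naive_count = len(example["fileArray"].split(","))
print(result["fragment.count"], naive_count)
```

Parsing the attribute as JSON counts three files here, while splitting on commas counts four, which is the case Mark was worried about.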

On Wed, Aug 10, 2016 at 3:42 PM, Mark Payne <[email protected]> wrote:
> Michael,
>
> Well, sort of...
>
> You could use:
> ${allDelineatedValues('${fileArray}', ','):count()}
>
> So that will split up the fileArray attribute by commas and then count them.
> The only issue is that if you were to have a filename with a comma in it,
> you'd get the wrong value. Given that you're not likely to have a filename
> with a comma, you may be all right, but it's not really the "cleanest"
> solution...
>
> The Expression Language does allow you to evaluate JSONPath against an
> attribute but JSONPath doesn't allow for the nice functions that you can get
> in XPath and similar.
>
> Anyone else have any better ideas?
>
>
> On Aug 10, 2016, at 3:32 PM, Michael Xu <[email protected]> wrote:
>
> Hi Mark,
>
> Thanks for your response earlier. While trying to implement what you
> suggested in your email, I came across the issue of updating fragment.count
> on a per-flowfile basis. I have another attribute called fileArray, which is
> a json-compatible array that contains all the files for a particular
> groupId. As an example taken from the Bulletin:
>
> Key: 'fileArray'
>         Value: '["file1.txt","file2.txt","file3.txt"]'
>
> Is it possible in UpdateAttribute to use the Expression Language to return
> the length of this array?
>
> Thanks for your help,
> Michael
>
> On Wed, Aug 10, 2016 at 11:00 AM, Mark Payne <[email protected]> wrote:
>>
>> Michael,
>>
>> In the MergeContent processor, you can set the "Merge Strategy" to
>> "Defragment." This will tell MergeContent to
>> determine its bin thresholds based on the following FlowFile attributes:
>>
>> fragment.identifier
>> fragment.index
>> fragment.count
>>
>> So you'd need to set those 3 attributes on each of the FlowFiles. Rather
>> than using the Correlation Attribute Name,
>> you'd set the "fragment.identifier" attribute (you can use UpdateAttribute
>> to copy the value from the groupId attribute
>> to the 'fragment.identifier' attribute if you need to).
>>
>> The "fragment.index" attribute tells MergeContent how to order the
>> different FlowFiles in the merged bin.
>>
>> The "fragment.count" attribute tells MergeContent how many FlowFiles go
>> in this bin.
>>
>> Does that all make sense?
>>
>> Thanks
>> -Mark
>>
>>
>> On Aug 10, 2016, at 10:54 AM, Michael Xu <[email protected]> wrote:
>>
>> I am sending payloads into the MergeContent processor; each belongs to a
>> certain group of files in some data I'm working with. Each payload has an
>> attribute called "groupId" which is an identification number for a
>> particular group of files. This is the attribute I'm using to bin each
>> incoming flowfile, and have set the Correlation Attribute Name to groupId.
>>
>>
>>
>> The problem I'm dealing with right now is that each groupId has a varying
>> number of files associated with it. As such, I'm not sure how in NiFi to
>> detect when the MergeContent processor has received all files for a
>> particular groupId, and once done, release the bin.
>>
>>
>>
>> Any help with this problem is appreciated, thanks!
>>
>>
>
>
