It's probably a good idea to use a script in a script processor to extract the details needed about the splits, then feed those results into the merge attributes as you suggested. This would be the safest/cleanest approach.
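
For what it's worth, here is a minimal sketch of the counting logic such a script might perform. It is plain Python, not actual NiFi scripting API code, and the function names (count_by_json, count_by_commas) are made up for illustration. It assumes fileArray always holds a valid JSON array of strings, as in Michael's example; in a real flow the result would be written to the fragment.count attribute.

```python
import json


def count_by_json(file_array):
    """Parse the attribute value as JSON and return the number of entries."""
    files = json.loads(file_array)
    if not isinstance(files, list):
        raise ValueError("fileArray is not a JSON array")
    return len(files)


def count_by_commas(file_array):
    """Count comma-delimited pieces, roughly what the comma-split approach does."""
    return len(file_array.split(","))


sample = '["file1.txt","file2.txt","file3.txt"]'
tricky = '["a,b.txt","c.txt"]'  # hypothetical filename containing a comma

print(count_by_json(sample), count_by_commas(sample))  # 3 3
print(count_by_json(tricky), count_by_commas(tricky))  # 2 3
```

The second line of output shows why parsing the JSON is safer: a comma inside a filename inflates the comma-split count, while the parsed array still reports the right number of files.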

On Wed, Aug 10, 2016 at 3:42 PM, Mark Payne <[email protected]> wrote:
> Michael,
>
> Well, sort of...
>
> You could use:
> ${allDelineatedValues('${fileArray}', ','):count()}
>
> So that will split up the fileArray attribute by commas and then count them.
> The only issue is that if you were to have a filename with a comma in it,
> you'd get the wrong value. Given that you're not likely to have a filename
> with a comma, you may be all right, but it's not really the "cleanest"
> solution...
>
> The Expression Language does allow you to evaluate JSONPath against an
> attribute, but JSONPath doesn't allow for the nice functions that you can get
> in XPath and similar.
>
> Anyone else have any better ideas?
>
>
> On Aug 10, 2016, at 3:32 PM, Michael Xu <[email protected]> wrote:
>
> Hi Mark,
>
> Thanks for your response earlier. While trying to implement what you
> suggested in your email, I came across the issue of updating fragment.count
> on a per-flowfile basis. I have another attribute called fileArray, which is
> a json-compatible array that contains all the files for a particular
> groupId. As an example taken from the Bulletin:
>
> Key: 'fileArray'
> Value: '["file1.txt","file2.txt","file3.txt"]'
>
> Is it possible in UpdateAttribute to use the Expression Language to return
> the length of this array?
>
> Thanks for your help,
> Michael
>
> On Wed, Aug 10, 2016 at 11:00 AM, Mark Payne <[email protected]> wrote:
>>
>> Michael,
>>
>> In the MergeContent processor, you can set the "Merge Strategy" to
>> "Defragment." This will tell MergeContent to
>> determine its bin thresholds based on the following FlowFile attributes:
>>
>> fragment.identifier
>> fragment.index
>> fragment.count
>>
>> So you'd need to set those 3 attributes on each of the FlowFiles. Rather
>> than using the Correlation Attribute Name,
>> you'd set the "fragment.identifier" attribute (you can use UpdateAttribute
>> to copy the value from the groupId attribute
>> to the 'fragment.identifier' attribute if you need to).
>>
>> The "fragment.index" attribute tells MergeContent how to order the
>> different FlowFiles in the merged bin.
>>
>> The "fragment.count" attribute tells MergeContent how many FlowFiles go
>> in this bin.
>>
>> Does that all make sense?
>>
>> Thanks
>> -Mark
>>
>>
>> On Aug 10, 2016, at 10:54 AM, Michael Xu <[email protected]> wrote:
>>
>> I am sending payloads into the MergeContent processor that each belong to
>> a certain group of files in some data I'm working with. Each payload has an
>> attribute called "groupId" which is an identification number for a
>> particular group of files. This is the attribute I'm using to bin each
>> incoming flowfile, and I have set the Correlation Attribute Name to groupId.
>>
>> The problem I'm dealing with right now is that each groupId has a varying
>> number of files associated with it. As such, I'm not sure how in NiFi to
>> detect when the MergeContent processor has received all files for a
>> particular groupId, and once done, release the bin.
>>
>> Any help with this problem is appreciated, thanks!
>>
