Michael,

There are a handful of examples of ExecuteScript using Javascript and/or Jython on my blog (http://funnifi.blogspot.com) and at other locations:

Javascript:
http://funnifi.blogspot.com/2016/03/executescript-json-to-json-revisited.html
https://mail-archives.apache.org/mod_mbox/nifi-users/201603.mbox/%3ccalhfc-wwqmz7rmkrt2qxfgdnh1fehdm0o_hdztgwwfgn+z0...@mail.gmail.com%3E

Jython:
http://funnifi.blogspot.com/2016/03/executescript-json-to-json-revisited_14.html
https://community.hortonworks.com/articles/35568/python-script-in-nifi.html
https://mail-archives.apache.org/mod_mbox/nifi-users/201602.mbox/%3CCAEV8zdWm7_E-qC1KKHV8eW8CP0HZaEkwjC=rdvtqnj+i85c...@mail.gmail.com%3E

I'm happy to help get you going with a scripted solution if you like.

Regards,
Matt

On Wed, Aug 10, 2016 at 4:12 PM, Michael Xu <[email protected]> wrote:
> Mark,
> The expression you suggested seems to be working. I don't think the file
> names I'm working with will have a comma, so this should be a good solution.
>
> Joe,
> Are you referring to the ExecuteScript processor? That looks like a good
> alternative. However, I couldn't find much information for it in the
> documentation (https://nifi.apache.org/docs.html). Is there anywhere I can
> find simple examples, especially in Javascript or Python?
>
> Thank you,
> Michael
>
> On Wed, Aug 10, 2016 at 3:44 PM, Joe Witt <[email protected]> wrote:
>>
>> Probably a good idea to use a script in a script processor to extract
>> the details needed about the splits then feed those results into merge
>> attribute as you suggested. This would be the safest/cleanest.
>>
>> On Wed, Aug 10, 2016 at 3:42 PM, Mark Payne <[email protected]> wrote:
>> > Michael,
>> >
>> > Well, sort of...
>> >
>> > You could use:
>> > ${allDelineatedValues('${fileArray}', ','):count()}
>> >
>> > So that will split up the fileArray attribute by commas and then count
>> > them. The only issue is that if you were to have a filename with a
>> > comma in it, you'd get the wrong value. Given that you're not likely to
>> > have a filename with a comma, you may be all right, but it's not really
>> > the "cleanest" solution...
>> >
>> > The Expression Language does allow you to evaluate JSONPath against an
>> > attribute, but JSONPath doesn't allow for the nice functions that you
>> > can get in XPath and similar.
>> >
>> > Anyone else have any better ideas?
>> >
>> >
>> > On Aug 10, 2016, at 3:32 PM, Michael Xu <[email protected]> wrote:
>> >
>> > Hi Mark,
>> >
>> > Thanks for your response earlier. While trying to implement what you
>> > suggested in your email, I came across the issue of updating
>> > fragment.count on a per-flowfile basis. I have another attribute called
>> > fileArray, which is a json-compatible array that contains all the files
>> > for a particular groupId. As an example taken from the Bulletin:
>> >
>> > Key: 'fileArray'
>> > Value: '["file1.txt","file2.txt","file3.txt"]'
>> >
>> > Is it possible in UpdateAttribute to use the Expression Language to
>> > return the length of this array?
>> >
>> > Thanks for your help,
>> > Michael
>> >
>> > On Wed, Aug 10, 2016 at 11:00 AM, Mark Payne <[email protected]>
>> > wrote:
>> >>
>> >> Michael,
>> >>
>> >> In the MergeContent processor, you can set the "Merge Strategy" to
>> >> "Defragment." This will tell MergeContent to determine its bin
>> >> thresholds based on the following FlowFile attributes:
>> >>
>> >> fragment.identifier
>> >> fragment.index
>> >> fragment.count
>> >>
>> >> So you'd need to set those 3 attributes on each of the FlowFiles.
>> >> Rather than using the Correlation Attribute Name, you'd set the
>> >> "fragment.identifier" attribute (you can use UpdateAttribute to copy
>> >> the value from the groupId attribute to the 'fragment.identifier'
>> >> attribute if you need to).
>> >>
>> >> The "fragment.index" attribute tells MergeContent how to order the
>> >> different FlowFiles in the merged bin.
>> >>
>> >> The "fragment.count" attribute tells MergeContent how many FlowFiles
>> >> go in this bin.
>> >>
>> >> Does that all make sense?
>> >>
>> >> Thanks
>> >> -Mark
>> >>
>> >>
>> >> On Aug 10, 2016, at 10:54 AM, Michael Xu <[email protected]> wrote:
>> >>
>> >> I am sending into the MergeContent processor, payloads that each
>> >> belong in a certain group of files in some data I'm working with. Each
>> >> payload has an attribute called "groupId" which is an identification
>> >> number for a particular group of files. This is the attribute I'm
>> >> using to bin each incoming flowfile, and have set the Correlation
>> >> Attribute Name to groupId.
>> >>
>> >> The problem I'm dealing with right now is that each groupId has a
>> >> varying number of files associated with it. As such, I'm not sure how
>> >> in NiFi to detect when the MergeContent processor has received all
>> >> files for a particular groupId, and once done, release the bin.
>> >>
>> >> Any help with this problem is appreciated, thanks!
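
For reference, the scripted approach discussed in this thread could look roughly like the sketch below. It is not taken from any of the linked posts. It assumes a Jython script body for ExecuteScript, that the attribute is named fileArray and always holds a JSON array, and that fragment.count is the attribute to set. The helper name count_files is made up for illustration.

```python
import json


def count_files(file_array_json):
    """Return the number of entries in a JSON array string.

    Hypothetical helper; expects input such as
    '["file1.txt","file2.txt","file3.txt"]'.
    """
    files = json.loads(file_array_json)
    if not isinstance(files, list):
        raise ValueError("fileArray is not a JSON array")
    return len(files)


# ExecuteScript supplies `session`, `REL_SUCCESS` and `REL_FAILURE` to the
# script. The check below only exists so the helper can also be run
# outside NiFi, where `session` is not defined.
try:
    session
except NameError:
    session = None

if session is not None:
    flowFile = session.get()
    if flowFile is not None:
        try:
            # Assumption: the attribute is named 'fileArray', as in the thread.
            count = count_files(flowFile.getAttribute('fileArray'))
            flowFile = session.putAttribute(flowFile, 'fragment.count', str(count))
            session.transfer(flowFile, REL_SUCCESS)
        except (ValueError, TypeError):
            # Missing attribute or malformed JSON: route to failure.
            session.transfer(flowFile, REL_FAILURE)
```

Because the array is parsed as JSON rather than split on commas, a filename that contains a comma is still counted once. MergeContent in Defragment mode would still need fragment.identifier and fragment.index set on each FlowFile, as described in the thread.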
