Michael,

There are a handful of ExecuteScript examples using JavaScript and/or
Jython on my blog (http://funnifi.blogspot.com) and elsewhere:

Javascript:
http://funnifi.blogspot.com/2016/03/executescript-json-to-json-revisited.html
https://mail-archives.apache.org/mod_mbox/nifi-users/201603.mbox/%3ccalhfc-wwqmz7rmkrt2qxfgdnh1fehdm0o_hdztgwwfgn+z0...@mail.gmail.com%3E

Jython:
http://funnifi.blogspot.com/2016/03/executescript-json-to-json-revisited_14.html
https://community.hortonworks.com/articles/35568/python-script-in-nifi.html
https://mail-archives.apache.org/mod_mbox/nifi-users/201602.mbox/%3CCAEV8zdWm7_E-qC1KKHV8eW8CP0HZaEkwjC=rdvtqnj+i85c...@mail.gmail.com%3E

I'm happy to help get you going with a scripted solution if you like.
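
As a starting point, here is a minimal Jython sketch for ExecuteScript. It
assumes the attribute is named fileArray and holds a JSON array, as in the
example from earlier in the thread, and that the script runs with the
standard ExecuteScript bindings (session, REL_SUCCESS, REL_FAILURE). The
helper name fragment_count is made up for illustration. Treat this as a
sketch to adapt, not a tested flow:

```python
import json


def fragment_count(file_array_json):
    """Return the number of entries in a JSON array given as a string."""
    files = json.loads(file_array_json)
    if not isinstance(files, list):
        raise ValueError("expected a JSON array")
    return len(files)


# ExecuteScript provides `session`, `REL_SUCCESS` and `REL_FAILURE` to the
# script. The guard below only lets the same file run outside NiFi too.
if 'session' in globals():
    flowFile = session.get()
    if flowFile is not None:
        try:
            # Assumption: the attribute is named 'fileArray', as in the thread.
            count = fragment_count(flowFile.getAttribute('fileArray'))
            flowFile = session.putAttribute(flowFile, 'fragment.count', str(count))
            session.transfer(flowFile, REL_SUCCESS)
        except (TypeError, ValueError):
            # Missing attribute or a value that is not a JSON array.
            session.transfer(flowFile, REL_FAILURE)
```

Because the value is parsed as JSON rather than split on commas, a filename
that contains a comma still counts as one entry.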

Regards,
Matt

On Wed, Aug 10, 2016 at 4:12 PM, Michael Xu <[email protected]> wrote:
> Mark,
> The expression you suggested seems to be working. I don't think the file
> names I'm working with will have a comma, so this should be a good solution.
>
> Joe,
> Are you referring to the ExecuteScript processor? That looks like a good
> alternative. However, I couldn't find much information for it in the
> documentation (https://nifi.apache.org/docs.html). Is there anywhere I can
> find simple examples, especially in Javascript or Python?
>
> Thank you,
> Michael
>
> On Wed, Aug 10, 2016 at 3:44 PM, Joe Witt <[email protected]> wrote:
>>
>> Probably a good idea to use a script in a script processor to extract
>> the details needed about the splits then feed those results into merge
>> attribute as you suggested.  This would be the safest/cleanest.
>>
>> On Wed, Aug 10, 2016 at 3:42 PM, Mark Payne <[email protected]> wrote:
>> > Michael,
>> >
>> > Well, sort of...
>> >
>> > You could use:
>> > ${allDelineatedValues('${fileArray}', ','):count()}
>> >
>> > So that will split up the fileArray attribute by commas and then count
>> > them.
>> > The only issue is that if you were to have a filename with a comma in
>> > it,
>> > you'd get the wrong value. Given that you're not likely to have a
>> > filename
>> > with a comma, you may be all right, but it's not really the "cleanest"
>> > solution...
>> >
>> > The Expression language does allow you to evaluate JSONPath against an
>> > attribute but JSONPath doesn't allow for the nice functions that you can
>> > get
>> > in XPath and similar.
>> >
>> > Anyone else have any better ideas?
>> >
>> >
>> > On Aug 10, 2016, at 3:32 PM, Michael Xu <[email protected]> wrote:
>> >
>> > Hi Mark,
>> >
>> > Thanks for your response earlier. While trying to implement what you
>> > suggested in your email, I came across the issue of updating
>> > fragment.count
>> > on a per-flowfile basis. I have another attribute called fileArray,
>> > which is
>> > a json-compatible array that contains all the files for a particular
>> > groupId. As an example taken from the Bulletin:
>> >
>> > Key: 'fileArray'
>> >         Value: '["file1.txt","file2.txt","file3.txt"]'
>> >
>> > Is it possible in UpdateAttribute to use the Expression Language to
>> > return
>> > the length of this array?
>> >
>> > Thanks for your help,
>> > Michael
>> >
>> > On Wed, Aug 10, 2016 at 11:00 AM, Mark Payne <[email protected]>
>> > wrote:
>> >>
>> >> Michael,
>> >>
>> >> In the MergeContent processor, you can set the "Merge Strategy" to
>> >> "Defragment." This will tell MergeContent to
>> >> determine its bin thresholds based on the following FlowFile
>> >> attributes:
>> >>
>> >> fragment.identifier
>> >> fragment.index
>> >> fragment.count
>> >>
>> >> So you'd need to set those 3 attributes on each of the FlowFiles.
>> >> Rather
>> >> than using the Correlation Attribute Name,
>> >> you'd set the "fragment.identifier" attribute (you can use
>> >> UpdateAttribute
>> >> to copy the value from the groupId attribute
>> >> to the 'fragment.identifier' attribute if you need to).
>> >>
>> >> The "fragment.index" attribute tells MergeContent how to order the
>> >> different FlowFiles in the merged bin.
>> >>
>> >> The "fragment.count" attribute tells MergeContent how many FlowFiles go
>> >> in this bin.
>> >>
>> >> Does that all make sense?
>> >>
>> >> Thanks
>> >> -Mark
>> >>
>> >>
>> >> On Aug 10, 2016, at 10:54 AM, Michael Xu <[email protected]> wrote:
>> >>
>> >> I am sending payloads into the MergeContent processor, each of which
>> >> belongs to a certain group of files in some data I'm working with. Each
>> >> payload has an attribute called "groupId", which is an identification
>> >> number for a particular group of files. This is the attribute I'm using
>> >> to bin each incoming flowfile, and I have set the Correlation Attribute
>> >> Name to groupId.
>> >>
>> >>
>> >>
>> >> The problem I'm dealing with right now is that each groupId has a
>> >> varying
>> >> number of files associated with it. As such, I'm not sure how in NiFi
>> >> to
>> >> detect when the MergeContent processor has received all files for a
>> >> particular groupId, and once done, release the bin.
>> >>
>> >>
>> >>
>> >> Any help with this problem is appreciated, thanks!
>> >>
>> >>
>> >
>> >
>
>
