Matt,
Thank you for those links, they should give me a good starting point.

Michael


On Wed, Aug 10, 2016 at 4:21 PM, Matt Burgess <[email protected]> wrote:

> Michael,
>
> There are a handful of examples of ExecuteScript using Javascript
> and/or Jython, on my blog (http://funnifi.blogspot.com) and other
> locations:
>
> Javascript:
> http://funnifi.blogspot.com/2016/03/executescript-json-to-json-revisited.html
> https://mail-archives.apache.org/mod_mbox/nifi-users/
> 201603.mbox/%3CCALhfc-WwqmZ7RMkRt2qxfgDnH1feHdM0o_
> [email protected]%3E
>
> Jython:
> http://funnifi.blogspot.com/2016/03/executescript-json-to-json-revisited_14.html
> https://community.hortonworks.com/articles/35568/python-script-in-nifi.html
> https://mail-archives.apache.org/mod_mbox/nifi-users/
> 201602.mbox/%3CCAEV8zdWm7_E-qC1KKHV8eW8CP0HZaEkwjC=
> [email protected]%3E
>
> I'm happy to help get you going with a scripted solution if you like.
>
> Regards,
> Matt
>
> On Wed, Aug 10, 2016 at 4:12 PM, Michael Xu <[email protected]> wrote:
> > Mark,
> > The expression you suggested seems to be working. I don't think the file
> > names I'm working with will have a comma, so this should be a good
> > solution.
> >
> > Joe,
> > Are you referring to the ExecuteScript processor? That looks like a good
> > alternative. However, I couldn't find much information for it in the
> > documentation (https://nifi.apache.org/docs.html). Is there anywhere I can
> > find simple examples, especially in Javascript or Python?
> >
> > Thank you,
> > Michael
> >
> > On Wed, Aug 10, 2016 at 3:44 PM, Joe Witt <[email protected]> wrote:
> >>
> >> Probably a good idea to use a script in a script processor to extract
> >> the details needed about the splits then feed those results into merge
> >> attribute as you suggested.  This would be the safest/cleanest.
> >>
> >> On Wed, Aug 10, 2016 at 3:42 PM, Mark Payne <[email protected]> wrote:
> >> > Michael,
> >> >
> >> > Well, sort of...
> >> >
> >> > You could use:
> >> > ${allDelineatedValues('${fileArray}', ','):count()}
> >> >
> >> > So that will split up the fileArray attribute by commas and then count
> >> > them. The only issue is that if you were to have a filename with a comma
> >> > in it, you'd get the wrong value. Given that you're not likely to have a
> >> > filename with a comma, you may be all right, but it's not really the
> >> > "cleanest" solution...
> >> >
> >> > The Expression Language does allow you to evaluate JSONPath against an
> >> > attribute, but JSONPath doesn't allow for the nice functions that you
> >> > can get in XPath and similar.
> >> >
> >> > Anyone else have any better ideas?
> >> >
> >> >
> >> > On Aug 10, 2016, at 3:32 PM, Michael Xu <[email protected]> wrote:
> >> >
> >> > Hi Mark,
> >> >
> >> > Thanks for your response earlier. While trying to implement what you
> >> > suggested in your email, I came across the issue of updating
> >> > fragment.count on a per-flowfile basis. I have another attribute called
> >> > fileArray, which is a JSON-compatible array that contains all the files
> >> > for a particular groupId. As an example taken from the Bulletin:
> >> >
> >> > Key: 'fileArray'
> >> >         Value: '["file1.txt","file2.txt","file3.txt"]'
> >> >
> >> > Is it possible in UpdateAttribute to use the Expression Language to
> >> > return the length of this array?
> >> >
> >> > Thanks for your help,
> >> > Michael
> >> >
> >> > On Wed, Aug 10, 2016 at 11:00 AM, Mark Payne <[email protected]> wrote:
> >> >>
> >> >> Michael,
> >> >>
> >> >> In the MergeContent processor, you can set the "Merge Strategy" to
> >> >> "Defragment." This will tell MergeContent to determine its bin
> >> >> thresholds based on the following FlowFile attributes:
> >> >>
> >> >> fragment.identifier
> >> >> fragment.index
> >> >> fragment.count
> >> >>
> >> >> So you'd need to set those 3 attributes on each of the FlowFiles.
> >> >> Rather than using the Correlation Attribute Name, you'd set the
> >> >> "fragment.identifier" attribute (you can use UpdateAttribute to copy
> >> >> the value from the groupId attribute to the 'fragment.identifier'
> >> >> attribute if you need to).
> >> >>
> >> >> The "fragment.index" attribute tells MergeContent how to order the
> >> >> different FlowFiles in the merged bin.
> >> >>
> >> >> The "fragment.count" attribute tells MergeContent how many FlowFiles
> >> >> go in this bin.
> >> >>
> >> >> Does that all make sense?
> >> >>
> >> >> Thanks
> >> >> -Mark
> >> >>
> >> >>
> >> >> On Aug 10, 2016, at 10:54 AM, Michael Xu <[email protected]> wrote:
> >> >>
> >> >> I am sending payloads into the MergeContent processor, each of which
> >> >> belongs to a certain group of files in some data I'm working with.
> >> >> Each payload has an attribute called "groupId", which is an
> >> >> identification number for a particular group of files. This is the
> >> >> attribute I'm using to bin each incoming flowfile, and I have set the
> >> >> Correlation Attribute Name to groupId.
> >> >>
> >> >>
> >> >>
> >> >> The problem I'm dealing with right now is that each groupId has a
> >> >> varying number of files associated with it. As such, I'm not sure how
> >> >> in NiFi to detect when the MergeContent processor has received all
> >> >> files for a particular groupId, and once done, release the bin.
> >> >>
> >> >>
> >> >>
> >> >> Any help with this problem is appreciated, thanks!
> >> >>
> >> >>
> >> >
> >> >
> >
> >
>
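
For anyone following the thread, the difference between the comma-splitting expression and a scripted approach can be sketched in plain Python. This is only an illustration under assumptions: the function names, the sample values, and the `index` parameter are hypothetical, and the wiring into ExecuteScript (reading and writing FlowFile attributes) is deliberately left out. It shows the logic a script might run on the fileArray value to produce the three fragment.* attributes that the Defragment strategy reads.

```python
import json


def count_by_comma_split(file_array):
    """Mimic the comma-splitting approach: count comma-separated pieces."""
    return len(file_array.split(","))


def count_by_json(file_array):
    """Parse the attribute value as JSON and count the array elements."""
    return len(json.loads(file_array))


def fragment_attributes(group_id, file_array, index):
    """Build the three attributes MergeContent's Defragment strategy reads.

    `index` is a hypothetical input: the thread does not say how each
    FlowFile's position within its group would be determined.
    """
    return {
        "fragment.identifier": group_id,
        "fragment.index": str(index),
        "fragment.count": str(count_by_json(file_array)),
    }


# Sample value from the thread, plus a made-up one with a comma in a filename.
simple = '["file1.txt","file2.txt","file3.txt"]'
tricky = '["a,b.txt","c.txt"]'

print(count_by_comma_split(simple), count_by_json(simple))
print(count_by_comma_split(tricky), count_by_json(tricky))
print(fragment_attributes("42", simple, 0))
```

Both counts agree on the sample from the thread, but the comma split overcounts as soon as a filename contains a comma, which is why parsing the value as JSON in a script is the safer route.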
