Re: Most efficient means to search for a character in flowFiles

2019-03-13 Thread Joe Witt
James, for the problem as you described it, the processor you definitely want is https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.9.0/org.apache.nifi.processors.standard.ScanContent/index.html. It is a rather impressively fast implementation of a string search

Re: Most efficient means to search for a character in flowFiles

2019-03-13 Thread Mark Payne
RouteOnContent may be a good solution. ScanContent is probably a more efficient recommendation, though. ReplaceText would probably also work well, but you don't want to use Evaluation Mode of Entire Text - you're buffering the content of the entire FlowFile into memory and running a regex over
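
To illustrate the buffering concern in the message above, here is a minimal Python sketch of a streaming search for a single byte. It is not NiFi code and does not reflect how ScanContent, RouteOnContent, or ReplaceText are implemented; the function name `contains_byte` and the 64 KiB chunk size are assumptions chosen for illustration. The point is only that a single-byte search can read fixed-size chunks and stop at the first hit, so memory use stays bounded regardless of payload size.

```python
import io


def contains_byte(stream, target=0x00, chunk_size=64 * 1024):
    """Return True as soon as the byte `target` appears in a binary stream.

    Hypothetical helper for illustration only. Reads `chunk_size` bytes at a
    time, so the whole payload is never held in memory at once.
    """
    needle = bytes([target])
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # End of stream reached without finding the byte.
            return False
        if needle in chunk:
            # Early exit: no need to read the rest of the payload.
            return True


# Usage with in-memory stand-ins for flowFile content.
has_null = contains_byte(io.BytesIO(b"abc\x00def"))
no_null = contains_byte(io.BytesIO(b"abcdef"))
```

Because the target is a single byte, a match cannot straddle a chunk boundary; a multi-byte pattern would need overlap handling between chunks, which this sketch does not attempt.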

Re: Most efficient means to search for a character in flowFiles

2019-03-13 Thread Matt Burgess
Jim, did you try RouteOnContent? I'm curious to see if it is faster for your use case than ReplaceText (and hopefully it'd be faster than Jython). Regards, Matt On Wed, Mar 13, 2019 at 11:56 AM James McMahon wrote: > Wanted to follow up my question with what I eventually settled on.

Re: Most efficient means to search for a character in flowFiles

2019-03-13 Thread James McMahon
Wanted to follow up my question with what I eventually settled on. *Goal*: identify flowFiles whose payload contains null characters (\x00). Wanted the solution to be fast so that it did not create a bottleneck in my flow. *First attempt*: Use ReplaceText. Replace all nulls with

Re: set penalty duration on a flowfile

2019-03-13 Thread Boris Tyukin
Nice, thanks Andy! On Tue, Mar 12, 2019 at 3:56 PM Andy LoPresto wrote: > The code that currently penalizes a flowfile is below [1]. It reads directly from the context containing the processor’s configured (static) penalty duration. You could manually perform these steps (using your