James,
For the problem as you described it, the processor you definitely want is
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.9.0/org.apache.nifi.processors.standard.ScanContent/index.html
It is a rather impressively fast implementation of a string search
RouteOnContent may be a good solution. ScanContent is probably the more
efficient choice, though.
ReplaceText would probably also work well, but you don't want to use an
Evaluation Mode of Entire Text: that buffers the content
of the entire FlowFile into memory and runs a regex over
Jim,
Did you try RouteOnContent? I'm curious to see if it is faster for
your use case than ReplaceText (and hopefully it'd be faster than
Jython).
Regards,
Matt
On Wed, Mar 13, 2019 at 11:56 AM James McMahon wrote:
>
> Wanted to follow up my question with what I eventually settled on.
>
>
Wanted to follow up my question with what I eventually settled on.
*Goal*: identify FlowFiles whose payload contains null
characters (\x00). I wanted the solution to be fast so that it would not
create a bottleneck in my flow.
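
For illustration only, here is a minimal sketch of the check itself in plain
Python, outside of NiFi. It reads the payload in fixed-size chunks and stops at
the first null byte, so it never holds the whole payload in memory. This is not
how any NiFi processor is implemented; the function name contains_null_byte and
the 64 KiB chunk size are assumptions made up for this example.

```python
import io


def contains_null_byte(stream, chunk_size=64 * 1024):
    """Return True as soon as a NUL (0x00) byte is found in a binary stream.

    Hypothetical helper for illustration: reads chunk_size bytes at a time
    instead of loading the whole payload into memory.
    """
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # End of stream reached without seeing a NUL byte.
            return False
        if b"\x00" in chunk:
            # Stop early; the rest of the payload does not need to be read.
            return True


# Usage with in-memory payloads standing in for FlowFile content.
has_null = contains_null_byte(io.BytesIO(b"abc\x00def"))
no_null = contains_null_byte(io.BytesIO(b"abcdef"))
```

Because the pattern is a single byte, a match can never straddle a chunk
boundary, so no overlap handling between chunks is needed.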
*First attempt*: Use ReplaceText. Replace all nulls with
Nice, thanks Andy!
On Tue, Mar 12, 2019 at 3:56 PM Andy LoPresto wrote:
> The code that currently penalizes a flowfile is below [1]. It reads
> directly from the context containing the processor’s configured (static)
> penalty duration. You could manually perform these steps (using your
>