Re: BufferedReader best option to search through large flowfiles?

James McMahon Mon, 05 Jun 2023 05:49:46 -0700

Thank you very much, Mark and Lars. Ideally I prefer to use standard
"out of the box" processors. In this case my requirement is to identify
the bounding dates across all content in the flowfile. As I match my DT
patterns, I'll add the tokens to a Groovy list that I can later sort and
use to identify the extreme values. (I may actually throw out the extremes
to make sure I'm not working with an erroneous outlier.) I know how to
make those manipulations in a Groovy script. I don't know how to accomplish
them using standard processors.
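
For reference, here is a minimal sketch of the scan described above. It is
written as plain Java, which should carry over to a Groovy script with
little change. The class name, the method names, and the ISO yyyy-MM-dd
date pattern are all assumptions made for illustration; the real DT
patterns would replace the regex, and in NiFi the reader would wrap the
flowfile's content stream rather than a string.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundingDates {
    // Assumed date format: ISO yyyy-MM-dd. Substitute the real DT patterns here.
    private static final Pattern DATE = Pattern.compile("\\b\\d{4}-\\d{2}-\\d{2}\\b");

    /** Reads one line at a time and returns every parseable date, sorted ascending. */
    static List<LocalDate> scan(BufferedReader reader) {
        List<LocalDate> dates = new ArrayList<>();
        // lines() is lazy: lines are read as they are consumed, not all at once.
        reader.lines().forEach(line -> {
            Matcher m = DATE.matcher(line);
            while (m.find()) {
                try {
                    dates.add(LocalDate.parse(m.group()));
                } catch (DateTimeParseException e) {
                    // Matches the regex but is not a real date (e.g. 2023-13-45); skip it.
                }
            }
        });
        Collections.sort(dates);
        return dates;
    }

    /** Returns {earliest, latest} after dropping `trim` values from each end, or null. */
    static LocalDate[] bounds(List<LocalDate> sorted, int trim) {
        if (sorted.size() <= 2 * trim) {
            return null; // not enough dates left after trimming
        }
        return new LocalDate[] { sorted.get(trim), sorted.get(sorted.size() - 1 - trim) };
    }

    public static void main(String[] args) {
        String sample = "start 2023-06-05 end\nseen 2021-01-02 and 2022-03-04\nbad 2023-13-45\n";
        List<LocalDate> dates = scan(new BufferedReader(new StringReader(sample)));
        LocalDate[] b = bounds(dates, 0);
        System.out.println(b[0] + " .. " + b[1]);
    }
}
```

Two caveats with this sketch: the list of matched dates still lives in
memory, so content with a very large number of matches might be better
served by tracking a running minimum and maximum; and a line-at-a-time
reader only helps when the content actually contains line breaks.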

Mark, for future reference, is there a risk when using RouteText that a huge
flowfile might exhaust JVM or repository resources? Is there such a risk for
the ExtractText, ReplaceText, and RouteOnContent processors mentioned by Lars?

Jim

On Mon, Jun 5, 2023 at 8:25 AM Mark Payne <[email protected]> wrote:

> Jim,
>
> Take a look at RouteText.
>
> Thanks
> -Mark
>
>
> > On Jun 5, 2023, at 8:09 AM, James McMahon <[email protected]> wrote:
> >
> > Hello. I have a requirement to scan for multiple regex patterns in very
> > large flowfiles. Given that my flowfiles can be very large, I think my best
> > approach is to employ an ExecuteGroovyScript processor and a script using a
> > BufferedReader to scan the file one line at a time.
> >
> > I am concerned that I might exhaust jvm resources trying to otherwise
> > process large content if I try to handle it all at once. Is a
> > BufferedReader the right call? Does anyone recommend a better approach?
> >
> > Thanks in advance,
> > Jim
>
>
