Sudeep,
You need to be cautious when extracting the entire contents of a file
into an attribute. Attributes are stored in JVM memory, so exceptionally
large attributes will consume a considerable amount of that memory. To use
the ExtractText processor to grab the entire content, you first need to
set/adjust the following properties:
- Maximum Buffer Size <-- default is 1 MB but needs to be large enough to
accommodate the entire file.
- Maximum Capture Group Size <-- in your case, since your capture group
will be the entire file, this must also be large enough to hold the entire
content. If set too low, any characters beyond the configured size will be
truncated.
- Enable DOTALL Mode <-- needs to be set to true so that "." also matches
line terminators; otherwise your capture group stops at the first line
break.
- Include Capture Group 0 <-- you should set this to False to lessen your
JVM memory footprint here, since group 0 is the entire match and would
store the same content in a second attribute.
- Finally, you need to add a "New property" which will contain your capture
group, for example:
    property name: MyContent
    value: (.*)
The above value is a Java regular expression contained in a capture group.
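
To see why DOTALL matters for that expression, here is a small standalone
sketch using plain java.util.regex. It is not NiFi code; the class and
method names (DotallDemo, firstGroup) are made up for illustration, and it
assumes the processor's DOTALL property behaves like Pattern.DOTALL.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DotallDemo {

    // Hypothetical helper: returns capture group 1 of the first match,
    // or null if the pattern does not match at all.
    public static String firstGroup(String regex, int flags, String content) {
        Matcher m = Pattern.compile(regex, flags).matcher(content);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String content = "line one\nline two\nline three";

        // Without DOTALL, "." does not match "\n", so (.*) stops at the
        // end of the first line.
        String withoutDotall = firstGroup("(.*)", 0, content);

        // With DOTALL, "." matches line terminators too, so (.*) captures
        // the whole string.
        String withDotall = firstGroup("(.*)", Pattern.DOTALL, content);

        System.out.println("without DOTALL: " + withoutDotall);
        System.out.println("with DOTALL captures everything: "
                + content.equals(withDotall));
    }
}
```

Without the flag only "line one" is captured; with it, the captured group
equals the full input, which is what you want for MyContent.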
Matt
On Wed, Feb 24, 2016 at 9:22 AM, sudeep mishra <[email protected]>
wrote:
> Hi,
>
> Can someone please guide how to use the ExtractText processor to add
> entire flowfile content to an attribute?
>
>
> Thanks & Regards,
>
> Sudeep
>