As a backup to that, you can also write a Groovy script for ExecuteScript
that uses StAX to iterate over the XML data. It won't care about schemas
(Avro or XML) and the like; it will just check that the XML is well-formed.
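
A rough sketch of that StAX check, written as plain Java so it can be tried
outside NiFi; the same calls should carry over to a Groovy ExecuteScript body
more or less unchanged. The class and method names here are made up for
illustration, and the NiFi wiring (reading the FlowFile content, routing to
failure or setting an attribute) is left out on purpose.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: walks the whole document with StAX and reports
// whether the parser got to the end without complaining.
class XmlWellFormedCheck {

    static boolean isWellFormed(InputStream in) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // No schema or DTD processing; we only care that the stream parses.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);

        XMLStreamReader reader = null;
        try {
            reader = factory.createXMLStreamReader(in);
            // Pull every event; nothing is kept in memory, so large files are fine.
            while (reader.hasNext()) {
                reader.next();
            }
            return true;
        } catch (XMLStreamException e) {
            // Illegal characters, mismatched tags, truncated input, etc.
            return false;
        } finally {
            if (reader != null) {
                try {
                    reader.close();
                } catch (XMLStreamException ignored) {
                    // Nothing useful to do if close fails.
                }
            }
        }
    }

    // Convenience overload for quick experiments with string input.
    static boolean isWellFormed(String xml) {
        return isWellFormed(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }
}
```

In ExecuteScript you would presumably run this inside a session.read()
callback and then either transfer the FlowFile to failure or put an attribute
on it, so that only the files that fail the check go through the scrubbing
step.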

On Fri, Oct 26, 2018 at 11:42 AM Joe Witt <joe.w...@gmail.com> wrote:

> Can't your logic detect the strange characters and then apply its
> behavior?  Alternatively, you could perhaps use ValidateRecord and
> have its reader only understand the good records.  It should kick out
> the bad records and you can then apply deeper processing on them.
>
> Thanks
> On Fri, Oct 26, 2018 at 11:36 AM Shawn Weeks <swe...@weeksconsulting.us>
> wrote:
> >
> > Is there any way for a ScriptedRecordReader to set an attribute on a
> > FlowFile when there is an error? I have a situation where I've written a
> > Groovy script to parse XML into a specific record structure, and
> > occasionally the incoming data has characters not allowed in XML.
> > Unfortunately, the system that generates the XML does so through string
> > manipulation instead of actually understanding XML, so it crams all kinds
> > of junk characters into the data. I'd rather not scrub every file, as some
> > of them can be large, so I was trying to figure out a way to scrub them
> > only on exception.
> >
> >
> > Thanks
> >
> > Shawn Weeks
>
