As a fallback to that, you can also write a Groovy script for ExecuteScript that uses StAX to iterate over the XML data. It won't care about schemas (Avro or XML); it just checks that the XML is well-formed, which is enough to catch characters that aren't allowed in XML.
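
For illustration, here is a rough sketch of the StAX check itself, written in plain Java (it should port to Groovy with few changes). The class name XmlWellFormedCheck and the method firstError are made up for this example, and the ExecuteScript wiring (getting the FlowFile, reading its content, setting an attribute, routing) is left out. It assumes the JDK's built-in StAX parser, which reports illegal characters as an XMLStreamException.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: streams through the XML once and reports the first
// well-formedness error. No schema (Avro or XSD) is involved.
final class XmlWellFormedCheck {

    // Returns null if the stream parses cleanly, otherwise the parser's
    // message for the first error (which includes line and column).
    static String firstError(InputStream in) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Not needed for a well-formedness check, and safer turned off.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);

        XMLStreamReader reader = null;
        try {
            reader = factory.createXMLStreamReader(in);
            // Pull every event; the parser throws on the first bad construct.
            while (reader.hasNext()) {
                reader.next();
            }
            return null;
        } catch (XMLStreamException e) {
            return e.getMessage();
        } finally {
            if (reader != null) {
                try {
                    reader.close();
                } catch (XMLStreamException ignored) {
                    // Nothing useful to do if close fails.
                }
            }
        }
    }

    // Convenience overload for strings, assuming UTF-8 content.
    static String firstError(String xml) {
        return firstError(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }
}
```

In a flow, a null result would mean the file can go straight to the record reader, and a non-null result could be stored in a FlowFile attribute and used to route just that file to the scrubbing step. Since StAX streams the input, large files are not loaded into memory.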
On Fri, Oct 26, 2018 at 11:42 AM Joe Witt <joe.w...@gmail.com> wrote:
> Can't your logic detect the strange characters and then apply its
> behavior? Alternatively, you could perhaps use ValidateRecord and
> have its reader only understand the good records. It should kick out
> the bad records and you can then apply deeper processing on them.
>
> Thanks
> On Fri, Oct 26, 2018 at 11:36 AM Shawn Weeks <swe...@weeksconsulting.us> wrote:
> >
> > Is there any way for a ScriptedRecordReader to set an attribute on a
> > FlowFile when there is an error? I have a situation where I've written a
> > Groovy script to parse XML into a specific record structure, and
> > occasionally the incoming data has characters not allowed in XML.
> > Unfortunately, the system that generates the XML does it through string
> > manipulation instead of actually understanding XML, so it crams all kinds
> > of junk characters into the data. I'd rather not scrub every file, as some
> > of them can be large, so I was trying to figure out a way to only scrub
> > them on exception.
> >
> > Thanks
> >
> > Shawn Weeks